The Principle of Temporal Memory
I’ve been so busy researching and testing so many things that I forgot to share my greatest INSIGHT about TM.
What is it? It is the basic principle on which TM works!
Let me state it first and then I will elaborate.
Temporal Memory works by UPSCALING the input when learning (storing it in a higher dimensional memory) and then DOWNSCALING when predicting.
This insight helped me to implement TM without implementing exact models of HTM-neurons with dendrites and synapses.
In essence TM, at its simplest and crudest abstraction, can be thought of as a higher dimensional UNION of smaller interconnected SDRs, sort of.
The whole process of bursting and the rest can be implemented as UPSCALING the input SDR into an intermediary higher dimensional buffer and storing the result in the higher dimensional TM.
Then when predicting we can use another buffer to DOWNSCALE from the higher dimensional version stored in TM.
Why does it work?
The reason it works is that, as we know, SDR capacity grows exponentially as the SDR size grows linearly.
Storing a single 2000-bit SDR at 2% sparsity (40 active bits) in a 20-row by 2000-column TM, i.e. a 40000-bit SDR, makes the sparsity drop to 0.1%.
So if we do it in an orderly way we can pack in many SDRs. In addition, in the original TM activating a bit requires 8-20 synapses instead of 40. Plus, the dependency allows neurons to be reused across patterns.
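A quick check of the sparsity arithmetic, as a minimal sketch. It assumes one TM bit is set per active input bit; the variable names are only illustrative.

```python
# Figures taken from the text: a 2000-bit SDR at 2% sparsity,
# stored in a memory of 20 rows x 2000 columns.
sdr_size = 2000
active_bits = sdr_size * 2 // 100        # 2% of 2000 bits
tm_size = 20 * 2000                      # total bits in the higher dimensional memory
tm_sparsity = active_bits / tm_size      # fraction of TM bits used by one stored SDR

print(f"{active_bits} active bits out of {tm_size} -> {tm_sparsity:.2%}")
```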
TM without SDRs!!!
This opens up the possibility of implementing TM with other data representations.
Integers
For example, using pure Integer values! Yes, that's what I said: TM with simple Integers.
This is the second insight you will get in a single day :), and there is a third.
What capability of integers can we use to implement UPSCALING and DOWNSCALING?
There is an operation called modulo; using it we can divide the Integer into ranges.
What is the representational capacity of an Integer using modulo?
Here is a quick calculation for different integer types:
uint16 : 65535 i.e. modulo 10 => ~6500 distinct up-scalings for 10 distinct items/categories/classes
uint32 : np.iinfo(np.uint32).max / 10000 => ~430_000 up-scalings for representing 10_000 distinct items
uint64 : 1.8e15 up-scalings for representing 10_000 distinct items
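The figures above can be reproduced with plain Python integers. This is a small sketch; `UINT_MAX` and `upscalings` are hypothetical names, and the maxima are the same values `np.iinfo(...).max` reports for these types.

```python
# Maximum values of the unsigned integer types discussed above.
UINT_MAX = {"uint16": 2**16 - 1, "uint32": 2**32 - 1, "uint64": 2**64 - 1}

def upscalings(type_name, n_items):
    """How many distinct up-scaled copies of each of n_items fit in the type."""
    return UINT_MAX[type_name] // n_items

for name, n in [("uint16", 10), ("uint32", 10_000), ("uint64", 10_000)]:
    print(f"{name}: {upscalings(name, n):,} up-scalings for {n:,} items")
```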
Keep in mind that an MLP NN can't easily handle classifying ~100s of classes/categories.
BTW, if you don't use scaling you can use this idea to build a classifier instead of a TM.
So how do we use this? (Example using mod 10.)
Here is how we can record multiple sequences.
1,2,3,4,5 => 11,12,13,14,15
6,2,7,1,2 => 16,22,17,21,32
7,3,9,1,2 => 27,23,19,31,42
First we use base-num + 10, then on every new encounter of the same value we bump it by another 10.
E.g. the first 2 becomes 12, the second encounter of 2 in any sequence becomes 22, and so on.
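The bumping rule can be sketched in a few lines. This is only an illustration of the rule as stated, assuming raw values stay in the range 0..mod-1; `upscale_sequences` and `downscale` are hypothetical names.

```python
from collections import defaultdict

def upscale_sequences(sequences, mod=10):
    """Encode each value as value + mod * k, where k is how many times the
    value has been encountered so far across all sequences (k = 1 first time)."""
    seen = defaultdict(int)              # raw value -> encounters so far
    encoded = []
    for seq in sequences:
        out = []
        for v in seq:
            seen[v] += 1
            out.append(v + mod * seen[v])
        encoded.append(out)
    return encoded

def downscale(code, mod=10):
    """Recover the raw value from an up-scaled code."""
    return code % mod

sequences = [[1, 2, 3, 4, 5], [6, 2, 7, 1, 2], [7, 3, 9, 1, 2]]
encoded = upscale_sequences(sequences)
for raw, enc in zip(sequences, encoded):
    print(raw, "=>", enc)
```

Applying `downscale` to every code gives back the original sequences.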
Why do we encode it this way? Let's play it out.
- If we start with 6 we will latch onto 16 (which will carry us over the second sequence).
  Then 16 will predict 22; if we get 2 as the second value we predicted correctly and will use 22 to predict 17, i.e. 7.
  You see, if the first value had been 1, i.e. 11, we would expect the second value to be 12, not 22, which would mean the first sequence. The reason we have buffers is to keep the upscaled prediction, so that we don't lose track.
- If we start with 2 as the first value we have 4 options to choose from (a second visits/probability array may help us choose).
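The walk-through above can be sketched with a transition table plus a buffer of up-scaled predictions. This is a rough illustration, not the original implementation: `NEXT`, `ALL_CODES` and `step` are hypothetical names, and falling back to every stored up-scaling when no prediction matches is an assumption.

```python
MOD = 10
ENCODED = [[11, 12, 13, 14, 15], [16, 22, 17, 21, 32], [27, 23, 19, 31, 42]]

# Transition memory: up-scaled code -> the code that followed it.
NEXT = {a: b for seq in ENCODED for a, b in zip(seq, seq[1:])}
ALL_CODES = {c for seq in ENCODED for c in seq}

def step(value, predicted):
    """Consume one raw value; return (current candidate codes, next predictions)."""
    # Keep only the buffered predictions that match the incoming value.
    current = {c for c in predicted if c % MOD == value}
    if not current:
        # Start of input or lost track (assumption): consider every up-scaling.
        current = {c for c in ALL_CODES if c % MOD == value}
    nxt = {NEXT[c] for c in current if c in NEXT}
    return current, nxt

cur, pred = step(6, set())     # latch onto 16, predict 22
print(cur, pred)
cur, pred = step(2, pred)      # 2 confirms 22, predict 17 (i.e. 7)
print(cur, pred)
cur2, pred2 = step(2, set())   # starting with 2: four candidates
print(sorted(cur2))
```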
We do it this way to save up-scalings and to avoid having to make a decision when a number repeats. Otherwise we could have encoded it as follows, with one offset per sequence, but then we would sometimes lose track (look at 22, which appears twice in the second sequence):
1,2,3,4,5 => 11,12,13,14,15
6,2,7,1,2 => 26,22,27,21,22
7,3,9,1,2 => 37,33,39,31,32
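A short sketch of this alternative scheme shows where it loses track: a value that repeats inside one sequence gets the same code twice. `upscale_per_sequence` is a hypothetical name.

```python
from collections import Counter

def upscale_per_sequence(sequences, mod=10):
    """Alternative scheme: sequence number i (1-based) adds mod * i to every value."""
    return [[v + mod * (i + 1) for v in seq] for i, seq in enumerate(sequences)]

sequences = [[1, 2, 3, 4, 5], [6, 2, 7, 1, 2], [7, 3, 9, 1, 2]]
encoded = upscale_per_sequence(sequences)

# A code that occurs more than once has more than one possible continuation.
counts = Counter(code for seq in encoded for code in seq)
ambiguous = sorted(code for code, n in counts.items() if n > 1)

print(encoded)
print("ambiguous codes:", ambiguous)
```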
We can store the sequences (in the former format) in an array, and that is equivalent to an SDR TM. Cute, huh?
Of course we can also be more like TM and store the transitions instead.
seq2 :: 16:22,22:17,17:21,21:32
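Deriving the transitions from a stored sequence is a one-liner. A minimal sketch, where `transitions` is a hypothetical helper name:

```python
def transitions(encoded_seq):
    """Pair every up-scaled code with the code that follows it."""
    return list(zip(encoded_seq, encoded_seq[1:]))

seq2 = [16, 22, 17, 21, 32]
pairs = transitions(seq2)
print(",".join(f"{a}:{b}" for a, b in pairs))   # 16:22,22:17,17:21,21:32
```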
We can also mimic integer neurons. For example, what predicts 2? The list plays the role of a pack of synapses.
2 <= 11,21,31,16
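The "pack of synapses" for each raw value can be collected from the stored sequences. This is a sketch under the assumption that a neuron is keyed by the down-scaled value and lists the up-scaled codes that preceded it; `predictors` is a hypothetical name.

```python
from collections import defaultdict

ENCODED = [[11, 12, 13, 14, 15], [16, 22, 17, 21, 32], [27, 23, 19, 31, 42]]

def predictors(encoded, mod=10):
    """Map each raw value to the up-scaled codes that came right before it."""
    preds = defaultdict(list)
    for seq in encoded:
        for prev_code, code in zip(seq, seq[1:]):
            preds[code % mod].append(prev_code)
    return preds

synapses = predictors(ENCODED)
print(2, "<=", sorted(synapses[2]))
```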
Vectors
We can also use good old vectors for more complex data, where the first element is the upscale-index.
1,3,3 => <1,1>,<1,3>,<2,3>
grid location seq : <1,(1,1)>,<1,(1,2)>
grid loc + cmd seq : <1,(1,1,up)>,<1,(1,2,down)>,<2,(1,1,end)>
SA Transitions : <1,(1,1,start)>:<1,(1,1,up)>, <1,(1,1,up)>:<1,(1,2,down)>, <1,(1,2,down)>:<2,(1,1,end)>
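The upscale-index idea for vectors can be sketched as below. The `key` parameter is an assumption: counting encounters per grid location (rather than per whole item) is one reading that reproduces the grid loc + cmd example. `upscale_vectors` is a hypothetical name.

```python
from collections import defaultdict

def upscale_vectors(items, key=lambda item: item):
    """Pair each item with an upscale index: the number of times key(item)
    has been encountered so far (1 on the first encounter)."""
    seen = defaultdict(int)
    out = []
    for item in items:
        k = key(item)
        seen[k] += 1
        out.append((seen[k], item))
    return out

print(upscale_vectors([1, 3, 3]))

# Grid location + command; the index counts visits to the (row, col) location.
moves = [(1, 1, "up"), (1, 2, "down"), (1, 1, "end")]
print(upscale_vectors(moves, key=lambda m: m[:2]))
```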
to be continued …