If we have 2048 columns and 32 cells per column, what is the maximum pattern/sequence length that the TM can learn?
My math might be wrong, so please don’t take my word for it. Please check with a Monte Carlo simulation.
The math is complicated. Assuming your input SDR has only 1 on bit, every column of the TM can learn 32 distinct contexts in which it can be on.
Consider a simpler case. We have a TM with 4 columns and 2 cells per column (with the same assumption that every SDR has only 1 bit on). Then every column has 2 ways to be on. So the sequences A-B-C-D and A-A-C-D can be learned, but A-A-A-C can’t.
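To make the toy case concrete, here is a minimal sketch that checks whether a single sequence fits in this simplified model. The only rule it encodes is the one above: with 1-bit SDRs, each symbol owns one column, and each occurrence of that symbol in a distinct prior context consumes one cell, so a symbol can appear at most cells-per-column times. The function name and interface are mine, not from any HTM library.

```python
from collections import Counter

def learnable(sequence, cells_per_column=2):
    """Toy check: can a TM with 1-bit-per-SDR inputs learn this sequence?

    Each symbol maps to one column; every occurrence in a distinct prior
    context uses up one cell of that column, so a symbol can appear at
    most `cells_per_column` times in the sequence.
    """
    counts = Counter(sequence)
    return all(c <= cells_per_column for c in counts.values())

print(learnable("ABCD"))  # True  (A-B-C-D)
print(learnable("AACD"))  # True  (A-A-C-D)
print(learnable("AAAC"))  # False (A-A-A-C: A's column runs out of cells)
```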
For a more realistic case, assume bits in SDRs are evenly distributed with a 4% density and a sequence length of 16 (a random value I came up with). Then we can calculate that on average it will take 800 different SDRs for any given bit to have been on 32 times, with a ~50% chance of a bit overlapping with another in the sequence (on average, a bit that has been on will be on 1.5 times in the sequence). That gives us ~533 distinct sequences.
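Since the post above asks for a Monte Carlo check, here is a small simulation of the 800-SDR figure: with 4% density, a given bit is on in each random SDR with probability 0.04, so the expected number of SDRs until it has been on 32 times is 32 / 0.04 = 800. All parameter values are the ones assumed above, not measured from a real system.

```python
import random

def sdrs_until_bit_used_up(density=0.04, cells=32, trials=200, seed=0):
    """Monte Carlo estimate: how many random SDRs (at the given bit
    density) until one particular bit has been on `cells` times,
    i.e. until its column has used up all of its cells.
    Expected value is cells / density = 800 for the numbers above."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        on_count, n = 0, 0
        while on_count < cells:
            n += 1
            if rng.random() < density:  # the tracked bit is on in this SDR
                on_count += 1
        results.append(n)
    return sum(results) / len(results)

avg = sdrs_until_bit_used_up()
print(round(avg))  # should land close to 800
```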
I haven’t taken the effect of the activation threshold and other parameters into account. The actual number is far larger; the number I calculated is the worst case.
I think we have at least 2048^32 different sequences! But I’m not sure. I have the same question as you.
But the SDR has to have a certain number of bits on in the same 40-micrometer neighborhood to be effective.
This smoothly morphs into the adjacent and next-to-adjacent SDR neighborhoods, so this is not just the one-of-n suggested by the basic sparse SDR math. Topology matters.
And all of this ignores the lateral voting between cells, which acts as an AND gate. The level of the SDR has to be taken in the context of the total cell response to the incoming constellation of excitation.
How many minicolumns per input (or, worded another way, what is the column sparsity)? As @marty1885 mentioned, the activation threshold is also extremely important when talking about capacity.
This matters, because you can reuse the same 32 cells in each minicolumn in different combinations per input. For example, you could create a new representation A(33) by taking one cell from A(1), one from A(2), one from A(3), etc. to create a new unique representation for A which only overlaps each previous representation by one cell. If the activation threshold is more than one, then this overlap is too small to give you any false positives.
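Here is a toy illustration of that reuse argument. I assume (hypothetically) that input A activates 40 minicolumns with 32 cells each, and model a representation as a set of (column, cell) pairs, one cell per active column. The "borrowed diagonal" representation overlaps each of the original flat representations by only 1–2 cells, far below any reasonable activation threshold.

```python
# Toy model of cell reuse across representations of the same input A.
# Assumed (not from the thread): A activates columns 0..39, 32 cells each.
cols = range(40)
cells_per_col = 32

# 32 "flat" representations: A(i) uses cell i in every active column.
flat_reps = [frozenset((c, i) for c in cols) for i in range(cells_per_col)]

# A new representation that takes cell (c % 32) in column c, i.e. roughly
# one cell borrowed from each of the flat representations in turn.
new_rep = frozenset((c, c % cells_per_col) for c in cols)

overlaps = [len(new_rep & rep) for rep in flat_reps]
print(max(overlaps))  # at most 2 shared cells with any earlier representation
```

With an activation threshold above 2, none of the earlier representations would ever be falsely activated by the new one.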
The capacity of a TM layer with 2048 minicolumns is highly dependent on your configuration parameters, but with the right sparsity, activation threshold, and diversity of inputs, it is astronomically high.
In the ideal scenario, HTM capacity = n_cols! / (n_wins! x (n_cols - n_wins)!), i.e. the binomial coefficient C(n_cols, n_wins)
n_cols = total number of mini-columns in the HTM region
n_wins = number of winning mini-columns
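Plugging in the 2048 minicolumns from the original question, this formula is just "n_cols choose n_wins". The value of n_wins below (40, i.e. ~2% column sparsity) is an assumption for illustration:

```python
from math import comb

n_cols = 2048   # total number of mini-columns in the HTM region
n_wins = 40     # number of winning mini-columns (assumed, ~2% sparsity)

# n_cols! / (n_wins! * (n_cols - n_wins)!) == binomial coefficient
capacity = comb(n_cols, n_wins)
print(f"{capacity:.3e}")  # on the order of 10^84 distinct input codes
```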
In case anyone missed it, we talked about HTM sequence capacity in the Neuron Paper.
In our model it is straightforward to obtain an estimate of sequence capacity. Although we refer to the network model as a “sequence memory,” it is actually a memory of transitions. There is no representation or concept of the length of sequences or of the number of stored sequences. The network only learns transitions between inputs. Therefore, the capacity of a network is measured by how many transitions a given network can store. This can be calculated as the product of the expected duty cycle of an individual neuron (cells per column/column sparsity) times the number of patterns each neuron can recognize on its basal dendrites.
To relate this back to my earlier post, the “number of patterns each neuron can recognize” is highly dependent (by many orders of magnitude) on a couple of configuration parameters in particular - the number of minicolumns per input (also referred to on this thread as “number of winning mini-columns” or “column sparsity”) and the activation threshold.
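Taking the Neuron-paper estimate quoted above at face value, here is a back-of-the-envelope calculation of transition capacity. Every number below is an illustrative assumption (in particular patterns_per_neuron, which, as noted, varies by orders of magnitude with sparsity and activation threshold):

```python
# Transition capacity per the quoted estimate:
# (cells per column / column sparsity) * patterns each neuron can recognize.
cells_per_column = 32
column_sparsity = 40 / 2048      # assumed: 40 winning columns out of 2048
patterns_per_neuron = 200        # assumed value for basal dendrite patterns

transitions = (cells_per_column / column_sparsity) * patterns_per_neuron
print(f"~{transitions:,.0f} stored transitions")
```

Even with these modest assumptions the network stores hundreds of thousands of transitions, and raising patterns_per_neuron (via sparsity and threshold tuning) scales this directly.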