Why is the implementation of the temporal memory algorithm so complicated?

We can do this with just two vectors. Follow the steps below:

  1. Create two vectors to store the SDRs.
  2. Fully connect the elements of these two vectors.
  3. Update the connection weights each time the SDR changes.

These two vectors can store the transitions between patterns and can be used for prediction.
:thinking: :question:
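
For readers trying to picture the proposal, here is a minimal sketch of the three steps above. It is only one possible reading of the idea: the function names, the learning rate, and the threshold are assumptions made for illustration and are not part of any HTM implementation.

```python
# Hypothetical sketch of the two-vector proposal (illustrative names, not an HTM API).

def make_weights(n):
    """n x n table: weights[i][j] links bit i at time t to bit j at time t+1."""
    return [[0.0] * n for _ in range(n)]

def learn(weights, prev_sdr, curr_sdr, rate=0.1):
    """Strengthen every link from a previously active bit to a currently active bit."""
    for i in prev_sdr:
        for j in curr_sdr:
            weights[i][j] = min(1.0, weights[i][j] + rate)

def predict(weights, curr_sdr, threshold=0.5):
    """Return the bits whose summed input from the active bits reaches the threshold."""
    n = len(weights)
    scores = [sum(weights[i][j] for i in curr_sdr) for j in range(n)]
    return {j for j, s in enumerate(scores) if s >= threshold}

A = {0, 1, 2}   # active bits of the first pattern
B = {5, 6, 7}   # active bits of the pattern that follows it
w = make_weights(10)
for _ in range(5):
    learn(w, A, B)
predicted = predict(w, A)   # bits expected after seeing A
```

Note that the weight table holds one entry per pair of bits, so its size grows with the square of the vector length.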

It’s the same reason we don’t use FullyConnected/Dense layers for everything in neural networks. A full connection:

  1. Uses too much memory
  2. Is too slow - O(N^2)

Also, a 2D connection list isn’t enough. You’ll need a 3D one to make sequence learning work. Otherwise the algorithm can’t learn any concept about time and acts like a simple a->b mapping algorithm.
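
To make the a->b point concrete, here is a small sketch that uses plain symbols instead of SDRs. It assumes the extra dimension stands for context (what came before the current input) and models only one step of it; the TM represents context differently, with multiple cells per column, which this toy does not attempt to reproduce. All names are made up for the example.

```python
from collections import defaultdict

def train_first_order(sequences):
    """2D table: each symbol maps to the set of symbols seen right after it."""
    table = defaultdict(set)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            table[prev].add(nxt)
    return table

def train_with_context(sequences):
    """Keyed on (symbol before, current symbol): one extra dimension of context."""
    table = defaultdict(set)
    for seq in sequences:
        for before, prev, nxt in zip(seq, seq[1:], seq[2:]):
            table[(before, prev)].add(nxt)
    return table

# Two sequences that share the middle element B.
sequences = [["A", "B", "C"], ["X", "B", "D"]]
first_order = train_first_order(sequences)
with_context = train_with_context(sequences)

ambiguous = first_order["B"]           # the 2D table cannot tell which sequence it is in
resolved = with_context[("A", "B")]    # with context, B after A has a single successor
```

With the pairwise table, B predicts both C and D. Once the previous element is part of the key, the two sequences no longer collide.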

Algorithm optimization can be done later. I think the spatial pooler algorithm is slower than O(N^2).

the algorithm can’t learn any concept about time and acts like a simple a->b mapping

What does this mean?

One reason is that in most applications, the inputs are not all semantically dissimilar. For example, suppose the TM algorithm learned the transition A -> B’, and later an input came in which shared 25% of its bits with A. I would expect a weak prediction (depending on the configuration) whereby some (but not all) of the cells representing B’ become predictive.
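
The weak-prediction case can be sketched roughly as follows. This is a toy under stated assumptions: each B’ cell has a single segment with five synapses onto a fixed slice of A, and the activation threshold of 3 is picked arbitrarily. Real segments subsample the previous activity differently, and the cell indices here are invented.

```python
def predictive_cells(segments, active_cells, threshold):
    """Cells whose segment has at least `threshold` synapses onto active cells."""
    return {cell for cell, synapses in segments.items()
            if len(synapses & active_cells) >= threshold}

# Pattern A: 40 active cells, indices 0..39.
pattern_a = set(range(40))

# Cells 100..107 represent B'. Each has one segment with 5 synapses
# onto a different slice of A (an assumption made for this sketch).
segments = {100 + k: set(range(5 * k, 5 * k + 5)) for k in range(8)}

# A later input sharing 10 of its 40 bits (25%) with A.
similar = set(range(10)) | set(range(200, 230))

full = predictive_cells(segments, pattern_a, threshold=3)  # after seeing A itself
weak = predictive_cells(segments, similar, threshold=3)    # after the similar input
```

Here every B’ cell becomes predictive after A, while the partially overlapping input drives only the two cells whose segments happen to sample the shared bits.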

Yes, this could be one reason.

Your 3-step TM algorithm looks simple, which is good, but I honestly do not follow how it would satisfy the TM computational/programming requirements. Could you explain more?

My understanding is that TM inputs are active columns that are set up by the SP; they don’t directly care that much about SDRs.

I think active columns = on bits of the SDR; both the inputs and outputs of an HTM region are SDRs.

I think you meant the on bits of the input space. Not all columns that contain a subset of the SDR bits in their receptive fields become active. If they did, then the TM algorithm would be simpler. I guess I know how you got your 3-step algorithm.

I think the TM algorithm is not complicated if one compares it to other learning algorithms out there. It is straightforward; however, it is a bit hard to form a mental picture of it because the sequence learning is done in a distributed manner.

No, the input space is not always an SDR. Do you know how the SP works? The SP generates the SDR.

Implementation-wise, the SDR is just a concept of how the HTM prefers to represent its inputs. Inputs are encoded as SDRs using specific SDR encoders. The SP, I believe, is yet another concept, but implementation-wise it maps columns to input bits in the input space in groups called receptive fields. An input bit may or may not fall into a column’s receptive field. This also means that if a column becomes active, it is because it has “seen a set of bits” that overlaps the input (represented as an SDR). This set in practice is almost always a subset of the input bits, not a superset. So an active column is really carrying the information “I activate when I see this pattern in my receptive field”.

Going back to your 3-step algo, how does sequence learning work in that simple algo?

If you don’t know what I am asking, please don’t reply. thank you.

@OhMyHTM This looks to me like @Jose_Cueto is trying to understand your points and asking you counter questions. I also don’t quite understand the point you are trying to make. You’re saying the input space doesn’t have to be sparse, so it may not technically be an SDR, and I can see from Jose’s response that he understands that, too. But you are claiming (it seems to me) that the TM has too many steps and could be simplified. Correct? If so, that’s big news.


Maybe this is the confusion. It’s not really two vectors, but a vector of vectors. From one neuron’s standpoint, it has X dendritic segments, each having a unique number of synapses. This is more complex than just two vectors.

Does that help clear up our point of view?


My apologies for the confusion. I was trying to clarify and test my understanding of HTM from the implementation point of view. Most importantly, I believe simplifying the TM implementation would be a big code improvement.


If you are confused, I can write a demo later. I think these two vectors can store the transitions between patterns and can be used for prediction.


Here is my idea.

Simple, right? :grinning:
The only thing that needs to be done is to maintain a suitable vector capacity.

What does each of those bit strings represent? It seems quite a feat to get the TM down to two vectors, since there are so many layers of data structures:

Each SP mini-column contains a list of neurons, each of which contains a list of dendrite segments, each of which contains a list of synapses to specific cells (each synapse with a continually updating permanence value). I’m curious how simple a data structure can be while maintaining all of this info.
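
As a rough picture of that nesting, the layers could be written as plain data classes. This is only a sketch: the class and field names are assumptions chosen for the example and do not mirror any particular HTM codebase.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Synapse:
    presynaptic_cell: int   # index of the cell this synapse listens to
    permanence: float       # continually updated during learning

@dataclass
class Segment:
    synapses: List[Synapse] = field(default_factory=list)

@dataclass
class Cell:
    segments: List[Segment] = field(default_factory=list)

@dataclass
class Column:
    cells: List[Cell] = field(default_factory=list)

def count_synapses(columns):
    """Walk all four levels: column -> cell -> segment -> synapse."""
    return sum(len(seg.synapses)
               for col in columns
               for cell in col.cells
               for seg in cell.segments)

# Three mini-columns with four cells each; segments grow only where needed.
columns = [Column(cells=[Cell() for _ in range(4)]) for _ in range(3)]
columns[0].cells[1].segments.append(Segment([Synapse(7, 0.3), Synapse(9, 0.6)]))
columns[2].cells[0].segments.append(Segment([Synapse(1, 0.21)]))
total = count_synapses(columns)
```

Each cell can carry a different number of segments, and each segment a different number of synapses, which is what makes the structure hard to flatten into two fixed-length vectors.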


How would the SP algorithm reach anything close to O(n^2)?

There is a limited number of dendrites with a limited number of synapses. The dendrites are evaluated once per exposure (unless something very different has been introduced into the theory). Training (actual training and boosting) is done once per dendrite per exposure.

This gives O(2n) -> O(n) since there is nothing but linear referencing of the dendrites.
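
A quick way to check the linear-cost argument is to count synapse checks directly. The sketch below assumes a fixed number of synapses per column and covers only the overlap step; inhibition and learning are left out, and the receptive-field layout is an arbitrary deterministic choice made for the example.

```python
def make_columns(num_columns, synapses_per_column, input_size):
    """Give each column a fixed-size list of connected input indices."""
    return [[(c * 3 + s) % input_size for s in range(synapses_per_column)]
            for c in range(num_columns)]

def compute_overlaps(columns, active_inputs):
    """Return per-column overlap scores and the number of synapse checks made."""
    checks = 0
    overlaps = []
    for connected in columns:
        score = 0
        for idx in connected:
            checks += 1               # each synapse is visited once per exposure
            if idx in active_inputs:
                score += 1
        overlaps.append(score)
    return overlaps, checks

active = set(range(0, 64, 4))         # every fourth input bit is on
overlaps_small, checks_small = compute_overlaps(make_columns(100, 8, 64), active)
overlaps_large, checks_large = compute_overlaps(make_columns(200, 8, 64), active)
```

Doubling the number of columns doubles the number of checks, as long as the synapse count per column stays fixed.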