My analysis on why Temporal Memory prediction doesn't work on sequential data

I’m thinking I do still have a bug (or I just need to go back to the basics and refresh my memory), since now I would expect to see the behavior that @oiegorov described in the OP (i.e. representations never fully stabilizing for all elements in a repeating sequence).

Thank you for your response. Do you confirm that the problem I described in the OP is valid?

Also, what exactly prevents the highlighted cells in step 5 from growing synapses to cell #13 in column 1 and cell #11 in column 2 if we had maxSegmentsPerCell = 1, maxSynapsesPerSegment = 4, maxNewSynapseCount = 4? Wouldn’t the highlighted cells try to grow 2 more synapses? I’m not sure about that because I don’t remember seeing any discussion about the necessity of growing additional synapses for a cell that was correctly predicted…

You’re correct that the TM handles repeating sequences poorly, by default. The only immediately available solution is to use resets. The Backtracking TM addresses this problem by, upon bursting, asking, “Would this have bursted if the sequence had ‘started’ more recently?”, although I can’t say for certain whether it handles this flawlessly. Another imperfect approach I’ve used is to change the “winner cell” selection process so that it selects the same cell within the minicolumn every time the minicolumn bursts, within a limited timespan. The timespan would need to be at least as long as the repeating sequence.
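To make the “same winner cell within a limited timespan” idea concrete, here is a minimal standalone sketch. It is not NuPIC code: `StickyWinnerSelector`, `window`, and `choose_cell` are hypothetical names, and the choice that a pick expires `window` steps after it was first made (rather than being refreshed on every burst) is an assumption on my part.

```python
import itertools


class StickyWinnerSelector(object):
    """Reuse one winner cell per bursting minicolumn for a limited timespan.

    Hypothetical helper for illustration only; not part of NuPIC.
    """

    def __init__(self, window, choose_cell):
        self.window = window            # timesteps a choice stays "sticky"
        self.choose_cell = choose_cell  # fallback picker: column -> cell index
        self._choices = {}              # column -> (cell, timestep it was chosen)

    def select(self, column, timestep):
        """Return the winner cell for a minicolumn that bursts at `timestep`."""
        previous = self._choices.get(column)
        if previous is not None and timestep - previous[1] <= self.window:
            # Still inside the window: reuse the earlier winner cell.
            return previous[0]
        # No earlier choice, or it has expired: pick a cell and remember it.
        cell = self.choose_cell(column)
        self._choices[column] = (cell, timestep)
        return cell


# Usage: the fallback picker hands out a new cell index on every call, so
# any repeated value must have come from the sticky cache.
fresh_cells = itertools.count()
selector = StickyWinnerSelector(window=4,
                                choose_cell=lambda column: next(fresh_cells))
first = selector.select(column=7, timestep=0)
repeat = selector.select(column=7, timestep=4)   # within the window
later = selector.select(column=7, timestep=9)    # window has expired
```

With a repeating sequence of length 4, `window` has to be at least 4 so that the second pass through the sequence lands on the same cell.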

You’re correct that if maxNewSynapseCount (which is poorly named – elsewhere I call this “sampleSize”) is greater than the number of active minicolumns (and hence the number of winner cells), then in Step 5 it will grow 2 more synapses, connecting A’’ -> B’. But if this sample size is ≤ the number of active minicolumns, then it won’t connect A’’ -> B’. Typically the sample size is less than the number of active minicolumns – otherwise it wouldn’t really be “subsampling”, and it would have the ABCD / XBCY problem that I mentioned.
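As a rough sketch of the arithmetic above: assume a segment tries to grow “sample size minus the number of synapses it already has onto previously active cells”, limited by how many unconnected winner cells are available. The function and argument names are hypothetical, and the sketch ignores maxSynapsesPerSegment entirely.

```python
def synapses_to_grow(sample_size, active_potential_synapses,
                     unconnected_winner_cells):
    """How many new synapses a learning segment would try to grow.

    Simplified illustration of the subsampling rule described above;
    all names are hypothetical and not taken from NuPIC.
    """
    desired = sample_size - active_potential_synapses
    return max(0, min(desired, unconnected_winner_cells))


# Two active minicolumns, and the segment already has one active synapse
# per minicolumn. Two other winner cells are not yet connected.
grow_when_larger = synapses_to_grow(sample_size=4,
                                    active_potential_synapses=2,
                                    unconnected_winner_cells=2)
grow_when_equal = synapses_to_grow(sample_size=2,
                                   active_potential_synapses=2,
                                   unconnected_winner_cells=2)
```

In the first case the sample size exceeds the number of active minicolumns, so the segment grows the 2 extra synapses (the A’’ -> B’ case). In the second it grows none.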


I found the bug. I had written a loophole where a random sampling of up to maxNewSynapseCount previously active cells (which may not include the current connected cells) could end up forming connections to the currently active cells. I believe this is also the cause for my aforementioned issue with multiple cells in the same column representing the same context.


Thank you for the clarification! So backtracking is used to replace resetting?
I found out that NAB uses backtracking by default. Does it still use resetting? If yes, does it reset the sequence when a new week starts?


I hope this isn’t going off topic (I think it is relevant), but this particular goof may hint at a possible direction to explore for stabilizing the representations in a repeating sequence. Suppose the system is allowed to make a smaller number of additional connections (beyond maxNewSynapseCount) for some of the current winning cells (some number of them above activationThreshold) to a random sampling of the winning cells from T-1 that they aren’t already connected with. Then the representations would stabilize after the second time through the repeated sequence.

This would then lead to the case you mentioned of ambiguity for the C in ABCD vs XBCY. However, this implementation would result in duplicate cells in a sub-set of the C columns for the C in XBCY. One of the duplicates would be the same cell from ABCD and would be connected more weakly than the other duplicate which is unique to XBCY. The learning step could be tweaked to degrade the weaker of the two, which would eliminate the ambiguity.

It is an interesting idea. I’ll have to think it out further, and I’ll let you know what I learn from it.


Yeah, backtracking can act as a replacement for resets. I haven’t spent much time considering how effectively it replaces resets, but it definitely helps.

The “Numenta HTM” NAB detector uses backtracking, and it doesn’t use resets. The same is true of HTM Studio, HTM for Stocks, and it’s what we used in Grok. You can think of the Backtracking TM as a productized version of the Temporal Memory. It’s the pure algorithm, plus some non-biological stuff.

One quick note: the NAB README has a second table of results of different HTM variations. In that table, the “NumentaTM HTM” row is pure temporal memory, without backtracking. You can see that it does okay, but it’s better with backtracking.


That’s so interesting! Might I ask roughly how to implement Backtracking in NuPIC? Are there any examples? I’m using it for my thesis and would be very curious to see if/how it might affect my results. Thanks again for all of your thought leadership and guidance. I’m loving this thread!

Here is the code for it.


Direct link to the backtracking code:

The doc string is pretty thorough


Thank you for the explanations!
If I understand correctly, NumentaTM HTM is used when we pass “-d numentaTM” to the NAB’s
I can see in that it assigns tmImplementation=“tm_cpp” which makes NAB use the compute() method from So, NumentaTM HTM still uses backtracking?

As I mentioned in the OP, I couldn’t find a way to use the “pure” TM implementation…


The wraps the pure TM. It doesn’t use the Backtracking TM. It mimics the interface of the Backtracking TM. Here’s the line where this class wraps the pure TM:


OK, I see. Thank you!

I have to say that I am uncomfortable with backtracking and resetting when comparing to the biological cortex.
Considering that this model is supposed to be based on what the cortex is doing, how is this acceptable?
I expect that the degree of prediction is not arbitrarily long and that the flow of information up and down the hierarchy should be seeding the neighborhood of a column with constantly updated samples of motion and sensation.

Is there anywhere where the model is given a “performance specification” that is based on testable predictions against real cortex?

More plainly - how well do we expect this to work to say that it is like the real thing? We have lots of artifacts in the human neural system that are not what an engineer might say are ideal - could this merging of time sequences be part of how cortex works?


We did not move ahead with this model for research. All ongoing work, including recent sensorimotor models, does not include this backtracking. It was only meant to optimize applications and tests we were building for anomaly detection years ago.

I think the solution to this problem requires answers about how attention works, and we are still trying to lay the groundwork for object representation without attention. Attention must come soon, but then we start talking about behavior. And then it gets really interesting.


Can someone mention the key differences between the backtracking and classic Temporal Memory algorithms?

Hi all,

So I’m trying to implement the BacktrackingTM in place of the standard TM within the opf for comparison. I was able to import the BacktrackingTM into my ‘’ file, though I’m having trouble finding what exactly I should modify in the code to actually implement it.

The model type is ‘HTMPrediction’ and the inference type is ‘TemporalAnomaly’, as set in the model_params file. In the iPython notebook walkthrough there’s a point where tm = BacktrackingTM(…params…), though I don’t see an equivalent within the ‘…opf/clients/hotgym/anomaly/one_gym’ files I’m using. I tried looking in the ‘’ and ‘’ files as well in case the change should be there, though they’re both read-only.

I also noticed this from a prior post, though I’m having trouble finding ‘tmImplementation’ within either the run or params file.

Any advice?? Thanks again!

Here’s a note from our model param docs that explains:

So the backtracking tm is the default.

Ok great, so long as

‘temporalImp’: ‘cpp’

the BacktrackingTM is in place, right? Last question on this: is there another ‘temporalImp’ that would implement the original (non-Backtracking) TM?

Thanks again