Testing TM implementations?

I was wondering how to test for “correct” Temporal Memory (TM) behavior. What I mean is: how can I be sure that an implementation of TM works as expected?
Here are my thoughts, let me know what you think and if you have other ways to test TM.

OK. Here is the premise: every state of the TM (a STATE being all the active and/or predicted cells in the TM at a given point in time) represents the full sequence up to the current moment. Also, the current STATE should be unique across all states, because every new time step is a new sequence.

What this means is that when I train a TM region, I should never get the same STATE twice.

And this is my test for correct behavior of TM.
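As a sketch, the check I have in mind could look like this. Here `states` stands in for the sequence of active-cell SDRs a TM implementation would emit at each time step; how you extract those from a given implementation is implementation-specific, so this is only the uniqueness check itself:

```python
def all_states_unique(states):
    """Return True if no state SDR (set of active cell indices) repeats."""
    seen = set()
    for state in states:
        key = frozenset(state)
        if key in seen:
            return False
        seen.add(key)
    return True

# Toy example: three distinct states, then a repeat of the first one.
states = [{0, 5, 9}, {1, 5, 9}, {2, 6, 9}, {0, 5, 9}]
assert all_states_unique(states[:3])
assert not all_states_unique(states)
```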

Is this a fair assessment? Or are there rules that could loosen this requirement?


There are definitely cases where we want the same state repeated multiple times. The Temporal Memory is not trying to uniquely represent the input given its entire history. Instead, it is trying to uniquely represent the input in the current learned sequence. If the sequence isn’t learned, then the cells in the active columns burst. And if the sequence is learned, then the representation should be the same each time that sequence is seen. This is a really important property, since it allows us to distinguish between a pattern in different learned contexts.

I’d recommend studying the papers that describe the function of the Temporal Memory to better understand the specific behavior.

But then the question is: if it is not the full history, which “learned sequence” is it? The last 5 steps, the last 7 steps, the last X steps?
Yes, the column bursts, but you settle on one cell of the column, don’t you? And the active cells in the case of bursting will still represent a unique STATE, with the last step as unknown.

I’m going by: http://blog.agi.io/2014/04/temporal-pooler-theory.html
which seems to explain the theoretical innards of TM better than the papers do.

It seems hard for me to grasp how you can represent partial “learned sequences” (and there have to be many of them, sometimes overlapping) by storing only single-step transitions.
I’m talking more about the problem of detecting and assigning all the combinations of partial sequences to a state SDR, rather than the ability to represent them by a state SDR.

But then the question is: if it is not the full history, which “learned sequence” is it? The last 5 steps, the last 7 steps, the last X steps?

The TM is variable order, so how many steps it covers depends on the sequences it has learned and how many times it has seen them.

Yes, the column bursts, but you settle on one cell of the column, don’t you? And the active cells in the case of bursting will still represent a unique STATE, with the last step as unknown.

This representation will appear any time the same input values are seen in a previously-unlearned context. So the representation will be the same even though the context may be different.

I’m going by: http://blog.agi.io/2014/04/temporal-pooler-theory.html which seems to explain the theoretical innards of TM better than the papers do.

It seems hard for me to grasp how you can represent partial “learned sequences” (and there have to be many of them, sometimes overlapping) by storing only single-step transitions.
I’m talking more about the problem of detecting and assigning all the combinations of partial sequences to a state SDR, rather than the ability to represent them by a state SDR.

If you are still struggling with some of the concepts after reading that web site then perhaps the Numenta papers would still be useful. The HTM White Paper is older but good for understanding the Temporal Memory. The HTM Neuron paper is good for a more detailed understanding.

Also keep in mind that the Temporal Memory is only part of the picture. We have done some experiments with feedback to help make predictions when there is temporal noise in the middle of learned sequences. And we still have a lot more to understand about the cortex.

Probably I’m not explaining this very well, sorry for that. I’ve read the Numenta papers many times.

Let me define things first:

Current-input: the SDR arriving at the columns (size: nbits)
Current-state: all active cells, an SDR obtained by flattening the TM (size: nrows * nbits)

The current-state represents the current-input in the context of the sequence; it is not a representation of the whole sequence so far (as I was saying). It is simply a representation of a single item of data in context.
(If you vertically aggregate it, you will get back the original input.)

But if you don’t aggregate it vertically, you will have different state-SDRs every time, won’t you?
Even though the vertically-aggregated SDR will repeat often.

So my speculation is that the non-aggregated state-SDR will be unique every time?
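To make the aggregation concrete, here is a minimal sketch using the definitions above, with flattened cell indices laid out as row * nbits + column. That layout is an assumption for illustration, not a fixed API; the point is that two different cell-level states can aggregate to the same column-level SDR:

```python
def aggregate(state, nbits):
    """OR the cells of each column: map each flat cell index to its column."""
    return {cell % nbits for cell in state}

nrows, nbits = 4, 8

# Two different states over the same active columns {1, 3}:
# a burst (all cells per column) vs. a learned state (one cell per column).
burst   = {r * nbits + c for c in (1, 3) for r in range(nrows)}
learned = {0 * nbits + 1, 2 * nbits + 3}

assert aggregate(burst, nbits) == aggregate(learned, nbits) == {1, 3}
assert burst != learned  # the non-aggregated state SDRs still differ
```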

But if you don’t aggregate it vertically, you will have different state-SDRs every time, won’t you?

For learned sequences, you will always get the same cells within the columns active.

Take this case: you see the sequence A->B->C->D, followed by random patterns, and then repeated. Each letter represents a set of columns becoming active. After several iterations, some subset of the cells in the A columns will predict a specific subset of cells in the B columns, and these B cells will predict a specific set of cells in the C columns, and so on. So the representation when we see the pattern will be: A->B'->C'->D'. Here A denotes the A columns with all cells active, B' denotes the B columns with a specific cell in each column active, and so on. These representations will be the exact same every time this sequence of patterns is seen.
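A toy illustration of the A vs. B' distinction (the column/cell sizes and winner cells below are made up for the example; this is not a full TM implementation, just the two kinds of representation):

```python
CELLS_PER_COLUMN = 4

def burst(columns, cells_per_column=CELLS_PER_COLUMN):
    """Unlearned context: every cell in each active column is active."""
    return {(col, cell) for col in columns for cell in range(cells_per_column)}

def learned(columns, winner_cells):
    """Learned context: one specific predicted cell per active column."""
    return {(col, winner_cells[col]) for col in columns}

A_cols, B_cols = {0, 1}, {2, 3}
winners = {2: 1, 3: 3}          # which cell "won" per B column during learning

A  = burst(A_cols)              # first element of the sequence bursts
B1 = learned(B_cols, winners)   # first replay of the learned sequence
B2 = learned(B_cols, winners)   # second replay

assert B1 == B2                 # identical cell-level SDR on every repeat
assert len(A) == len(A_cols) * CELLS_PER_COLUMN  # burst activates all cells
```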

The TM representation will only be unique (different than any previous representation) when you see a pattern for the first time or when you learn a pattern in a new context.

hmmm…

These representations will be the exact same every time this sequence of patterns is seen.

That’s what I wanted to clear up. From what you are saying, the requirement for a TM implementation is that a specific representation of A will “prefer/predict” a specific representation of B, and so on. I.e. once you hit a specific representation of symbol X, the sequence that follows has a big chance of being very specific; as you said, it will be influenced by the frequency of occurrence of those specific transitions.
The only way to “disrupt” the sequence is when newly arriving data differs from the prediction.

My next natural question, then, would be: how will the TM handle loops and self-reference (if for a minute we imagine there is no incoming data to correct the predicted sequence)?

PS> I’m asking all these questions because I have my own implementation of TM and was wondering what the expected behavior is in more detail.
http://ifni.co/bbHTM.html

Good question. I’m also wondering about this. It would be good to define an easy “testing method”, e.g. taking a pre-defined data set (e.g. one of the NAB sets) and setting the HTM parameters (# of columns, boosting, …) to previously agreed values; the SP+TM algorithm, if properly implemented, should then result in outcome X. I’m not sure whether Numenta does something similar to confirm its own code (changes). If not yet, could we define such a standard test case?
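One possible shape for such a regression test, assuming a deterministic SP+TM pipeline run with fixed parameters and a fixed random seed: fingerprint the sequence of output SDRs and compare against a previously recorded reference value. The fingerprinting helper below is hypothetical, not an existing Numenta tool:

```python
import hashlib

def fingerprint(sdr_sequence):
    """Stable hash of a sequence of SDRs (each an iterable of active indices)."""
    h = hashlib.sha256()
    for sdr in sdr_sequence:
        # Sort indices so the hash does not depend on iteration order.
        h.update(",".join(map(str, sorted(sdr))).encode())
        h.update(b";")
    return h.hexdigest()

# Stand-in output; a real test would run the SP+TM pipeline on the fixed
# data set and compare the fingerprint against a stored reference string.
assert fingerprint([[1, 4, 7], [2, 4, 9]]) == fingerprint([[7, 4, 1], [9, 2, 4]])
assert fingerprint([[1, 4, 7]]) != fingerprint([[1, 4, 8]])
```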

@mraptor Regarding your point that all history is included in the current state: much of the information is actually in the dendrites and synapses (how many distal dendrites per cell, how many synapses per dendrite, which cells they are connected to, what their respective permanences are). This information is “lost” when taking an SDR snapshot. So I’d say the SDR represents the current state, but alone, e.g. if passed on to a higher level, it does not include history.
