I like the mechanism you introduce here and awesome presentation, thanks! I wanted to ask about learning representations in the output layer. In this example we presume to know about ‘ABCD’ and ‘XBCY’ as repeated sequences, but what if we didn’t know to look for these? Or what if another sequence appears like ‘DAYC’, is there a mechanism for forming a new representation bit in the output layer?
It seems to me that with this feature, the flexibility of this pooling mechanism could really help out the TM overall, especially in sequence classification. Great work!
Yes, I have been working on a mechanism which utilizes hex grid formation. I’ll go into it in more depth on a separate thread (didn’t want to muddy the waters on this thread, since the focus here is to explore strategies for addressing the repeating inputs problem)
A potential solution for this is to just sample the burst, i.e. only a random subset of the non-predicted columns is allowed to burst. If the size of that subset is bigger than the sparsity, you will still connect the sequence. In my implementation, that approach seems to work just fine.
The key is to build the sequence A->A not all at once but progressively. If the sample rate is low, the number of cells shared between A and A’ will be pretty high. In a few steps you will be back in A (A->A’->A’’->A) and your fixed-signal issue is solved. If another symbol appears in the sequence, you will fail. The eager approach has the potential problem of wasting synapses/cells on those constant signals. With my lazy way, you only fail at the point where the AAAA sequence ends. In general, lazy bursting seems to be beneficial in terms of synaptic load.
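Roughly, in Python (a minimal sketch; names like `BURST_SAMPLE_RATE` are made up for illustration, not from any particular codebase):

```python
import random

rng = random.Random(0)
BURST_SAMPLE_RATE = 0.3  # made-up parameter: chance an unpredicted column bursts

def sample_bursting_columns(unpredicted_columns):
    """Lazy bursting: only a random subset of the unpredicted minicolumns
    actually burst this step; the rest stay silent and get their chance
    on a later repetition of the sequence."""
    return [col for col in unpredicted_columns
            if rng.random() < BURST_SAMPLE_RATE]

# If the sampled subset is still large enough to connect a distal segment,
# repeated presentations walk the representation A -> A' -> A'' back onto A.
```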
So if the random flip decides that a burst should not happen, then what happens to the state of cells in that minicolumn? Presumably no cells in that minicolumn become active?
In your example, is A in a specific context (similar to A’ and A’’), or does it mean A without context (i.e. bursting minicolumns)? Sorry if this is a dumb question. I’m having trouble visualizing how this statement follows if you choose to skip some of the minicolumns during the bursting step:
Do you happen to have some code I could look at which uses this strategy?
Exactly: you just ignore the burst in that minicolumn. Nothing should happen there this time. Sooner or later the random flip should decide to burst that minicolumn.
The key is that you have to see the sequence multiple times before the whole context is learned. In a way, you are building the connections between the cells of two consecutive values in the sequence one by one, not all at once.
My code is really convoluted and too nasty to share (I can hardly understand it myself). Hopefully that will change in the future.
In any case, you can test it in yours just by inserting that `if` into the minicolumn burst step, as in the sketch below. It should learn the sequence (a bit more slowly).
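Something like this (a sketch only; `burst` stands in for whatever your burst routine already does):

```python
import random

def burst(column_cells):
    # Stand-in for your existing burst routine: activate all cells in the
    # minicolumn, choose a winner cell, and learn as usual.
    return list(column_cells)

def activate_unpredicted_column(column_cells, rng, sample_rate=0.3):
    """The extra `if`: an unpredicted minicolumn only bursts when the
    random flip says so; otherwise none of its cells become active."""
    if rng.random() < sample_rate:
        return burst(column_cells)
    return []  # skipped this step; it may burst on a later repetition

active = activate_unpredicted_column(range(32), random.Random(42))
```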
BTW: coincidentally, bursts in biological systems are also produced stochastically (although their meaning is not necessarily the same as in HTM).
Got it. I think this strategy might also lead to ambiguity between the C in ABCD vs the C in XBCY, in particular if they were both learned around the same time (i.e. without thoroughly training on one before training on the other). However, this is a bit tricky to visualize, so I’ll have to experiment to see whether or not this intuition is correct.
Anyway, note that in HTM, LTD is not really “bio” faithful. From a biological perspective, you should decrement the permanence whenever the presynaptic cell fires and the postsynaptic one doesn’t. HTM leaves “stale” synapses because it only punishes those synapses when the postsynaptic segment reaches the threshold. I’m using a more “bio” way (doing that requires deeper changes in the base architecture of the code).
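Roughly, the rule I mean looks like this (a simplified sketch with toy data structures, not my actual code):

```python
from dataclasses import dataclass, field

@dataclass
class Synapse:
    presynaptic_cell: int
    permanence: float

@dataclass
class Segment:
    cell: int                                    # postsynaptic cell it grows from
    synapses: list = field(default_factory=list)

def bio_ltd(segments, prev_active_cells, active_cells, decrement=0.01):
    """Weaken every synapse whose presynaptic cell fired last step while
    its postsynaptic cell did not fire this step, with no requirement
    that the segment reached its activation threshold."""
    for seg in segments:
        if seg.cell in active_cells:
            continue  # postsynaptic cell fired: no LTD on its segments
        for syn in seg.synapses:
            if syn.presynaptic_cell in prev_active_cells:
                syn.permanence = max(0.0, syn.permanence - decrement)
```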
I will definitely need to test this to get a firmer understanding. Initially, it seems to me that if the representation for C in ABCD contains some of the same cells as the C in XBCY, then as long as both sequences reappear often enough that the global decay process doesn’t cause one of them to be forgotten, the ambiguous connections for C in both contexts would be reinforced.
This means that there are different numbers of winnerCells for different bursting inputs, right? With 40 active columns, a totally unpredicted input would normally yield 40 bursts and 40 winnerCells chosen.
As I understand it, the process of burst → choose winner cell → build new segment
guarantees that a column caught off guard always learns. With an ignored burst, the column doesn’t learn, right? Whether or not it’s costly long term, I’m just curious if I have that correct.
You build the new segments with synapses to all the winner cells (one in each bursting minicolumn) if there is no segment with synapses to the previous activations. If there is a “partially” good segment there, you should grow new synapses in it. It’s like creating the connections between presynaptic and postsynaptic cells one by one (not all at once).
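In pseudo-Python, something like this (simplified structures; a segment here is just a dict from presynaptic cell to permanence):

```python
def learn_on_winner_cell(segments, prev_winner_cells,
                         match_threshold=3, initial_permanence=0.21):
    """Find the segment that best overlaps the previous winner cells; if
    none overlaps enough, start a new (empty) segment. Then grow synapses
    to every previous winner cell that isn't already connected, so the
    full connection builds up across repetitions, not in one shot."""
    prev = set(prev_winner_cells)
    best = max(segments, key=lambda seg: len(prev & seg.keys()), default=None)
    if best is None or len(prev & best.keys()) < match_threshold:
        best = {}
        segments.append(best)
    for pre in prev_winner_cells:
        best.setdefault(pre, initial_permanence)
    return best
```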
So all the prior winnerCells are included as presynaptic cells on the new segments? This sounds like upping the ‘maxNewSynapseCount’ parameter (or something similar). So instead of all columns bursting and learning on a subset of prior winnerCells, it’s some columns bursting and learning on the full set of prior winnerCells, correct?
I think that is not the same. maxNewSynapseCount serves to subsample the previously active cells. Here you are also “subsampling” segment creation. I think maxNewSynapseCount models a sort of “spatial sampling”, while this is some sort of “temporal sampling”.
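A toy contrast of the two (names and numbers are illustrative only):

```python
import random

rng = random.Random(42)
prev_winner_cells = list(range(40))    # toy: one winner per previously active column
unpredicted_columns = list(range(40))  # toy: 40 columns caught off guard
MAX_NEW_SYNAPSE_COUNT = 20             # standard TM parameter
BURST_SAMPLE_RATE = 0.3                # made-up lazy-burst parameter

# "Spatial" sampling: every learning column connects to only a subsample
# of the previous winner cells (maxNewSynapseCount caps it).
spatial = rng.sample(prev_winner_cells,
                     min(MAX_NEW_SYNAPSE_COUNT, len(prev_winner_cells)))

# "Temporal" sampling: each learning column connects to the full previous
# winner set, but only a random subset of columns learns on a given step.
temporal = [col for col in unpredicted_columns
            if rng.random() < BURST_SAMPLE_RATE]
```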
What is the problem with just using backtracking?
Is it just because it is not biologically inspired, or is there some disadvantage to using backtracking?
IMO, there is nothing in particular wrong with this approach from a practical perspective. I’m mainly interested in exploring problems like this one from different angles and getting other folks’ perspectives. There is a lot of background knowledge in the community here, from neuroscience to computer science to evolution to electrical engineering.