Why Neurons Have Thousands Of Synapses, A Theory Of Sequence Memory In Neocortex

blue2 · April 6, 2018, 2:07am

nbro · September 21, 2018, 5:39pm

The caption of Figure 6 of the paper mentioned in the title says:

The input stream used for this figure contained high-order sequences mixed with random elements. The maximum possible average prediction accuracy of this data stream is 50%.

@jhawkins, @subutai, I don’t really get why the average prediction accuracy would just be 50%, even in the presence of noise. Btw, the amount of noise added to the sequence is not even specified.

rhyolight · September 21, 2018, 5:43pm

If the max possible prediction accuracy is 50%, doesn’t that mean the stream is 50% noise?

nbro · September 21, 2018, 5:46pm

Even with a lot of noise, I don’t get why the prediction accuracy couldn’t still be 100%, given that the paper claims HTM is robust to noise, i.e. that it handles noisy sequences as if they were clean.

rhyolight · September 21, 2018, 5:48pm

If the input stream is 50% noise, and you are comparing your predictions to the input stream as ground truth, no algorithm will ever get much better than 50% accuracy.

nbro · September 21, 2018, 5:54pm

The paper doesn’t say that the accuracy is calculated the way you’re suggesting.

Also, the paper says:

For this simulation we designed the input data stream such that the maximum possible average prediction accuracy is 50%

The paper should have stated what was actually done in practice to achieve this.

Also, I really don’t get why one would set up this experiment so that the maximum average prediction accuracy is only 50%.

rhyolight · September 21, 2018, 5:55pm

To clarify, HTM will still keep sequences in context in the presence of noise, and continue making predictions given those sequences. It can hold onto those sequences in the noise and continue recognizing features of them over time, even when noise is persistent. Of course, nothing can predict random noise 100%, but HTM can pattern match temporal sequences even with lots of noise.

rhyolight · September 21, 2018, 5:59pm

So the input is not trivial to predict. I’m sure the sequences are very simple patterns. Given a noisy signal, here is our prediction accuracy (near 50%). As we remove cells, accuracy decreases. The point is to show how accuracy is affected by cell death, which is fault tolerance.

nbro · September 21, 2018, 6:06pm

This is not demonstrated in the paper.

nbro · September 21, 2018, 6:11pm

As a demonstration of the HTM sequence memory’s ability to make predictions, this experiment is at least a little suspicious, partly because simple artificial sequences were used. Anyway, it does show, to some extent, that it can cope with cell death.

Paul_Lamb · September 21, 2018, 6:24pm

I think if you understand the HTM algorithm, activity->prediction is no different in a simple sequence compared to a complex one. The connections formed during learning represent a transition from one element to another, regardless of the overall length or complexity of the sequence. The test shows how that transition is impacted by noise in the system.

nbro · September 21, 2018, 6:26pm

As far as I understood, the prediction accuracy of the HTM sequence memory particularly depends on the number of cells per column and on the complexity of the input sequences.

Paul_Lamb · September 21, 2018, 6:27pm

Number of cells in the column, yes, but not complexity of the sequence. Number of cells impacts capacity (i.e. how many contexts a particular input can be in).

nbro · September 21, 2018, 6:27pm

Actually, also on the complexity of the sequence. More specifically, it depends on the contextual Markovian order of the elements in the sequence. I suppose that you can cope with this by changing the number of cells per mini-column. Maybe you meant another type of complexity other than the dependencies between the elements in the sequence?

sheiser1 · September 21, 2018, 7:26pm

Upping the order of the sequences (like going to ‘a,b,c,d,e,f’ from ‘a,b,c’) increases the required capacity, though the standard 32 cells per column can potentially handle quite a lot. I think it will inevitably take more iterations to learn long sequences, though, regardless of the cells per column, since longer sequences mean more context.

Paul_Lamb · September 21, 2018, 7:31pm

Sorry, don’t really understand this point, but…

The capacity of a typically sized layer of 2048 minicolumns with 32 cells each is ridiculously huge. With the learning params set properly, even a sequence of ungodly length can be learned in a single shot.

Paul_Lamb · September 21, 2018, 7:44pm

To further hammer home my point, with the learning params set low, if it were to take 3 iterations to learn a very short sequence, it would likewise take 3 iterations to learn an enormously long one. The learning rule is applied locally to remember the transition between two elements in context. As such, it functions the same way regardless of how long the sequence is.

I can draw up some visualizations to further explain if this point sounds wrong to you.
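
To make the "local transition" point concrete, here is a toy sketch. It is not HTM: a plain Python dict stands in for the learned connections, and the names `learn` and `predict` are hypothetical. It only illustrates that each update stores a single (context, element) to next-element transition, so one pass over a sequence stores every transition in it, however long the sequence is.

```python
def learn(sequence, memory):
    """Store one transition per step: (context so far, current) -> next."""
    for i in range(len(sequence) - 1):
        context = sequence[:i]  # everything seen before the current element
        memory[(context, sequence[i])] = sequence[i + 1]


def predict(prefix, memory):
    """Look up the element that followed this prefix during learning."""
    return memory.get((prefix[:-1], prefix[-1]))


memory = {}
learn("XABCDE", memory)  # a single pass over each sequence
learn("YABCFG", memory)

print(predict("XABC", memory))  # D
print(predict("YABC", memory))  # F
```

The shared subsequence "ABC" leads to different predictions depending on whether it started with X or Y, which is the high-order context discussed above.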

subutai · September 21, 2018, 7:57pm

In this simulation, the stream contained sequences that were each 6 elements long, followed by 4 noise elements. e.g. “XABCDENNNNYABCFGNNNNXABCDENNNN…” (where N is a completely random input).

In this setup the four noise elements are unpredictable as is the first element of each sequence, so the best you can do is predict 50% of the elements.
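
The 50% figure can be checked by counting. The sketch below is only an illustration, not the experiment code: it builds a stream from the two example sequences, marks which positions an ideal predictor (one that has already learned both sequences with their context) could get right, and reports the fraction. The names `build_stream` and `max_accuracy` are hypothetical, and "N" is a placeholder for a fresh random input.

```python
import random

SEQUENCES = ["XABCDE", "YABCFG"]  # the two example sequences from the post
NOISE_LEN = 4                     # four random elements after each sequence


def build_stream(num_blocks, seed=0):
    """Return a list of (symbol, is_predictable) pairs."""
    rng = random.Random(seed)
    stream = []
    for _ in range(num_blocks):
        seq = rng.choice(SEQUENCES)
        for i, symbol in enumerate(seq):
            # The first element follows noise, so it cannot be anticipated.
            stream.append((symbol, i > 0))
        for _ in range(NOISE_LEN):
            # Noise is drawn fresh each time, so it is never predictable.
            stream.append(("N", False))
    return stream


def max_accuracy(stream):
    """Fraction of elements an ideal predictor could get right."""
    return sum(1 for _, ok in stream if ok) / len(stream)


print(max_accuracy(build_stream(1000)))  # 0.5
```

Each block has 10 elements, of which 5 (the second through sixth elements of the sequence) are predictable, hence 5/10 = 50%.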

The full code for the experiments is here:

nbro · September 21, 2018, 8:48pm

I’m still not sure what mathematical property lets you say that the best you can do is 50%. The noise is inserted at regular intervals between the sequences (the noise runs all have the same length, as do the “normal” sequences), and, after a while, the algorithm should have learned all the sequences.

Btw, how exactly did you measure the prediction accuracy?

rhyolight · September 21, 2018, 9:03pm

You know where to look to find this, Subutai just gave you the link.

github.com/numenta/htmresearch/blob/master/projects/sequence_learning/sequence_simulations.py#L122

                  nv.add(numpy.random.randint(n))
              noisyVecs.append(nv)
          
            return noisyVecs
          
          
          #########################################################################
          #
          # Core experiment routines
          
          def computePredictionAccuracy(pac, pic):
            """
            Given a temporal memory instance return the prediction accuracy. The accuracy
            is computed as (#correctly predicted cols / # predicted cols). The
            accuracy is 0 if there were no predicted columns.
            """
            pcols = float(pac + pic)
            if pcols == 0:
              return 0.0
            else:
              return (pac / pcols)
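
A small usage sketch of the same calculation follows. It assumes that `pac` and `pic` are counts of predicted columns that did and did not become active, which the excerpt itself does not confirm, and `compute_prediction_accuracy` is a hypothetical standalone rewrite, not the repository function.

```python
def compute_prediction_accuracy(pac, pic):
    """Return pac / (pac + pic), or 0.0 when nothing was predicted."""
    predicted = float(pac + pic)  # assumed: total number of predicted columns
    if predicted == 0:
        return 0.0
    return pac / predicted


print(compute_prediction_accuracy(30, 10))  # 0.75
print(compute_prediction_accuracy(0, 0))    # 0.0
```

Under this reading, the measure is the fraction of predictions that turned out correct, so a time step with no predictions at all, such as one following a noise element, scores 0.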
