An Apical Depolarization for Numenta: How to Generate the Allocentric Location Signal

Not to rat-hole too much on this issue, but once again, would this not be synaptic depression? My understanding of the terms (as well as the colloquial meaning of the words) makes me think that facilitation would indicate increasing strength, as the effect facilitates the faster production of spikes.

Another drawback is that a fixed sparsity is not maintained. When the inhibition step is executed as you described at 1:00:44, the sparsity of the representation is affected: pooled input that has lots of semantically similar features will result in a denser representation, whereas an object with many semantically dissimilar features will result in a sparser representation (or even multiple representations). The toy sketch below illustrates the effect.
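For a concrete toy illustration (a minimal sketch of the point, not Numenta's actual algorithm; the `pool` function, names, and numbers are all made up), pooling with a plain overlap threshold instead of a fixed-sparsity inhibition step produces dense output for similar features and sparse output for dissimilar ones:

```python
import numpy as np

rng = np.random.default_rng(42)
n_cells, n_active = 2048, 40

def random_sdr():
    return set(rng.choice(n_cells, n_active, replace=False).tolist())

def similar_sdr(base, shared=30):
    # A feature SDR sharing `shared` bits with `base` (semantically similar).
    kept = set(list(base)[:shared])
    outside = np.setdiff1d(np.arange(n_cells), list(base))
    rest = set(rng.choice(outside, n_active - shared, replace=False).tolist())
    return kept | rest

def pool(features, threshold=2):
    # A pooled cell activates if enough of the pooled features drive it;
    # note there is no k-winners step enforcing a fixed sparsity.
    counts = {}
    for sdr in features:
        for cell in sdr:
            counts[cell] = counts.get(cell, 0) + 1
    return {c for c, n in counts.items() if n >= threshold}

base = random_sdr()
similar = [base] + [similar_sdr(base) for _ in range(4)]
dissimilar = [random_sdr() for _ in range(5)]
print(len(pool(similar)), "pooled cells for similar features")      # dense (~30)
print(len(pool(dissimilar)), "pooled cells for dissimilar features")  # sparse (~8)
```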

I saw similar behavior in one of my own pooling strategies a while back. At the time I was considering addressing it either with modifications to the SP learning function, or by ditching SP entirely and growing proximal connections using an algorithm more like TM (the drawback with the latter being that you can no longer leverage the properties of bursting when switching from one object to another).

Whoops, I accidentally described synaptic depression. My point was that it’s possible for facilitating cells to need repetitive input before they can fire, because the synapses increase their weight regardless of whether or not the postsynaptic cell fires.
Also, I think the source I gave says that the outputs of L6 CT cells are what facilitate. The inputs to L6 from the thalamus probably depress a little, but L6 CT cells don’t stop firing or slow down very quickly.
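To make the distinction concrete, here is a toy model of a facilitating synapse (the constants are purely illustrative, not taken from the cited source): each presynaptic spike transiently boosts the synapse’s efficacy regardless of whether the postsynaptic cell fires, so the cell only reaches threshold after repetitive input:

```python
# Toy short-term facilitation: illustrative constants only.
base_weight = 1.0      # baseline synaptic efficacy
facilitation = 1.0     # multiplicative facilitation factor
boost = 0.5            # increment per presynaptic spike
decay = 0.9            # facilitation decays between spikes
threshold = 2.0        # postsynaptic firing threshold

for t in range(6):
    epsp = base_weight * facilitation  # effective input from this spike
    fired = epsp >= threshold
    print(f"spike {t}: EPSP={epsp:.2f}, postsynaptic fired={fired}")
    # Facilitation updates regardless of whether the postsynaptic cell fired.
    facilitation = 1.0 + (facilitation - 1.0) * decay + boost
```

Running this, the postsynaptic cell stays silent for the first few spikes and only fires once facilitation has built up, which is the behavior I was trying to describe.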

Actually, I would argue that not maintaining a fixed sparsity is the whole point of the system. Indeed, you are right: the SP cannot be used canonically for this system to work. In the code, which I will be posting soon, I completely redo the SP to make it more in the style of TM, as you described. It turns out this actually simplifies the code quite a bit.
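For concreteness, a rough sketch of what a TM-style proximal learning step might look like (a guess at the shape of the idea, not the actual forthcoming code; all parameters are illustrative): permanences on proximal synapses are reinforced the way TM reinforces distal segments, and cells activate on a raw overlap threshold with no fixed-sparsity inhibition:

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_cells = 1024, 512
perm = rng.uniform(0.0, 0.3, size=(n_cells, n_inputs))  # proximal permanences
CONNECTED, ACT_THRESHOLD = 0.2, 15
P_INC, P_DEC = 0.05, 0.01

def step(input_sdr, learn=True):
    """input_sdr: array of active input bit indices."""
    connected = perm >= CONNECTED
    overlap = connected[:, input_sdr].sum(axis=1)
    active = np.flatnonzero(overlap >= ACT_THRESHOLD)  # no k-winners step
    if learn:
        # TM-style reinforcement applied to proximal synapses: active cells
        # strengthen synapses to active inputs and weaken the rest.
        on = np.zeros(n_inputs, dtype=bool)
        on[input_sdr] = True
        perm[np.ix_(active, np.flatnonzero(on))] += P_INC
        perm[np.ix_(active, np.flatnonzero(~on))] -= P_DEC
        np.clip(perm, 0.0, 1.0, out=perm)
    return active

feature = rng.choice(n_inputs, 40, replace=False)
print(len(step(feature)), "cells active for this feature")
```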

On a separate but related note, my code does not model columns in either the inference layer or the pooling layer whatsoever. I do not think these are functionally relevant in the cortex.



In light of the above conversation, my implementation is actually not in alignment with @dwrrehman’s thought process, so I don’t think I’ll post it just yet. My demo is focused on highlighting the deficiencies of the strategy WRT sparsity, but I now think that those deficiencies are actually coming from a conflict with the SP process. I’ll do some more tinkering to adjust to my updated understanding of the process.

Of course, that sounds wonderful. I deeply look forward to your finished demo! :)

So the “inference layer” is layer 5 and the “pooling layer” is layer 4 in the neocortex? I mean, the inference must be an active inference on the motor commands, and the pooling must be pooling of sense data for use in the inference? Then are layers 2/3 also input layers to the inference?

Not exactly; it’s a little more complicated than that. In the paper, I describe the hypothesized function of every layer that is believed to exist in a cortical region, but here’s a summary of all of them and their functions:

layer 1: N/A
layer 2/3a: Pooling layer
layer 3b-alpha: Inference layer
layer 3b-beta: Pooling layer
layer 4: Inference layer
layer 5a (alpha/beta): Pooling layer
layer 5b-alpha: Inference layer
layer 5b-beta: Pooling layer
layer 6a: Inference layer
layer 6b (alpha/beta): Pooling layer
layer 6b-gamma: Inference layer

You noted input layers and output layers. It is a little more complicated than what you described, but in terms of the anatomical inputs and outputs of the region, these layers are classified as follows:

FF = feedforward
FB = feedback

layer 1: FB input layer
layer 2/3a: FF output layer
layer 3b-alpha: FF output layer
layer 3b-beta: FF input layer
layer 4: FF input layer
layer 5a (alpha/beta): FF output layer
layer 5b-alpha: FF output layer
layer 5b-beta: FF input layer
layer 6a: FF input layer
layer 6b (alpha/beta): FB output layer
layer 6b-gamma: FB output layer

There are some details I’m leaving out in terms of the FB system, but conceptually this is correct according to my theory.

I hope that helps!


For the more visually-oriented, I drew some pics depicting how this pooling strategy learns to represent an object given a series of feature/location inputs. Active cells are red and predictive cells are yellow.

For illustrative purposes, I have only used an object with 3 features and 3 locations. I’ve also aligned related cells to highlight what is going on, but in practice these cell positions would be randomized (and of course there would be a lot more cells involved).

This should give a sense of how the layer converges onto a representation, but let me know if I need to clarify something (or if this indicates gaps in my understanding of the process).

I could have taken this out several more timesteps until it stabilized, but this should be enough to see a representation for the coffee cup emerging.

EDIT - I missed a couple predictive cells in the top row for the last two timesteps, but you get the idea :). I’ll have a scenario like this in a demo app once I finish implementing it.


Hey Paul, what exactly are we looking at there? I gather the top row is the pooled object representation layer, and the three smaller populations are feature representations?

Yes, sorry I should have explained that a little better. You could imagine three fingers sensing the coffee cup, and that input represented via the three groups of cells on the bottom. Those are then pooled in the layer above them.

This is intended to depict an overly-simplified representation of the current SMI algorithm, with @dwrrehman’s pooling strategy in place of Numenta’s reset + random SDR strategy for the output layer. I just noticed that I forgot to depict the predictive states in the input layers, so that might have led to part of the confusion.

Note that this pooling strategy (at least in my current understanding of it) requires two or more features to be input at once. This allows it to make associations between them. For example, you can imagine in steps 7-12, two fingers are stationary on the coffee cup while the other finger is exploring it. The learning algorithm applied to the proximal synapses results in associations being formed between the features, eventually resulting in a pooled representation.

Another way to accomplish the required association between features (if you want to have a one-for-one relationship between inference and pooling layers) is for the pooling layer to take proximal input from the inference layer over two timesteps.
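A tiny sketch of that two-timestep variant (assumed mechanics on my part, not a confirmed implementation): the pooling layer’s proximal input at time t is the union of the inference layer’s activity at t-1 and t, so proximal learning on that input naturally associates consecutive features:

```python
from collections import deque

def pooled_proximal_inputs(inference_activity_stream):
    """Yield the pooling layer's proximal input: the union of the
    inference layer's active cells over two consecutive timesteps."""
    window = deque(maxlen=2)
    for active_cells in inference_activity_stream:
        window.append(set(active_cells))
        yield set().union(*window)

# Example: three feature/location representations sensed in sequence.
features = [{1, 2, 3}, {10, 11, 12}, {20, 21, 22}]
for t, proximal in enumerate(pooled_proximal_inputs(features)):
    print(f"t={t}: proximal input = {sorted(proximal)}")
```

At t=1 the proximal input contains both the first and second features, so any learning rule applied to it ties consecutive features of the object together.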

Also note that the purpose of showing horizontal rows of many active cells in the first few timesteps was to highlight the fact that it is not the traditional minicolumn bursting. In practice, these cells would be randomized over the layer, but aligning them this way makes it easier to visualize what is going on.


Paul,

Did you ever publish your code for the pooling implementation?

I am modifying it, since my initial implementation was defective (due to conflicting logic with SP and minicolumn concepts). I’ll publish the code once I have made the necessary corrections.

I actually think seeing how it fails is very interesting and useful. Building these cognitive architectures requires a robust understanding of how the individual components work and don’t work.

I also did my own drawing of @dwrrehman’s SMI architecture, but broke it down by functionality instead of neurogeography. For those of us who aren’t neuroscientists, it actually helps to see how the pieces work together on a functional level. I’ll post it here when I clean it up.


I may go back and finish the demo at some point (I have a hard time making myself finish working on obvious dead-ends, lol).

At a high level, the reason it failed is that I was assuming traditional SP was handling the proximal stream to the pooling layer. This meant that in order for pooling to occur, all the feature/location representations of the object needed significantly overlapping bits, so that they activated a significant number of the same columns after SP.

In more “realistic” (presumably) scenarios without a lot of overlap between feature/location representations, you would end up with just a couple active cells after inhibition, and usually with more than one representation for the object (consisting of only a couple cells each). The demo I started was intended to highlight this deficiency.
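Here is a toy reproduction of that failure mode (illustrative parameters only, not the real SP code): with a standard overlap-plus-k-winners step, the columns that win for one feature/location SDR rarely win for another unless the SDRs share many bits, so almost no columns stay active across the whole object:

```python
import numpy as np

rng = np.random.default_rng(1)
n_inputs, n_cols, n_active, k = 1024, 512, 20, 40

# Random proximal connectivity: one boolean row of connected bits per column.
connected = rng.random((n_cols, n_inputs)) < 0.5

def sp_winners(sdr):
    # Standard SP step: overlap score followed by global k-winners inhibition.
    overlap = connected[:, sorted(sdr)].sum(axis=1)
    return set(np.argsort(overlap)[-k:].tolist())

# Four feature/location SDRs with little mutual overlap.
features = [set(rng.choice(n_inputs, n_active, replace=False).tolist())
            for _ in range(4)]
winner_sets = [sp_winners(f) for f in features]
stable = set.intersection(*winner_sets)
print(f"columns that win for all {len(features)} features: {len(stable)}")
# Typically zero or near zero: only a couple of cells would survive pooling.
```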

Perhaps that significant overlap is not sufficiently provided by the proximal stream alone. It may require the associated inference layer to constantly be feeding predictions into the pooling layer, and it may use some of the other data streams.

I don’t think feature/location pairs by themselves should have any overlap in general. The relationship comes from their associations through transformations of the object. The inference layer provides that transformation based on egocentric features (which are motor commands) as well as the built-up model of the object in the pooling layer.

I think this gets into @dwrrehman’s “slow loop” and “fast loop”. It’s a pretty interesting concept and I’d love to see how they differ in practice. But Daniel’s proposed pooling layer implementation probably won’t work in isolation with simple proximal inputs. It needs extra sauce to create that overlap condition.

Hi, thanks, yes that helps, thanks so much… but I’m also greedy for more: do you have any neuroscientific references for these layers? If you do, I can cross-reference them against my own understanding and model. Rgds, finn

@fine2100 Take a look at the last 5 pages of the paper at the top of this thread for a categorized list of references.

Thanks :-)