HTM Mini-Columns into Hexagonal Grids!

Bitking, can I clarify something with you?

The first and most important question for me at the moment is how to treat the input that each mini-column / L2/3 cell would receive. In the spatial pooler implementation with a 2D input space (let's say MNIST), each mini-column has a receptive field (as it is a 2D array, I guess this is a square around its centre). Within this receptive field (i.e. offsets from the column index, etc.), a random subset of input cells / array indices is chosen to be treated as connected: around 2%, or however the original algorithm determines it.
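
A minimal sketch of the receptive-field idea described above, under stated assumptions: the field is a square clipped to the input bounds, and a random subset of it is marked as potentially connected. The function name `potential_pool` and the `radius`, `fraction`, and `seed` parameters are hypothetical, chosen for illustration; this is not the API of any particular HTM library, and the fraction is left as a parameter rather than fixed at the 2% guessed above.

```python
import random

def potential_pool(center, radius, shape, fraction, seed=0):
    """Pick a random subset of input cells inside a square receptive field.

    center   -- (row, col) of the mini-column in input space
    radius   -- half-width of the square field, in input cells (assumption)
    shape    -- (rows, cols) of the 2D input, e.g. (28, 28) for MNIST
    fraction -- share of the field to treat as connected (assumption)
    """
    rows, cols = shape
    cr, cc = center
    # Square field around the centre, clipped at the input edges.
    field = [(r, c)
             for r in range(max(0, cr - radius), min(rows, cr + radius + 1))
             for c in range(max(0, cc - radius), min(cols, cc + radius + 1))]
    k = max(1, round(fraction * len(field)))
    # A seeded generator keeps the choice reproducible.
    return random.Random(seed).sample(field, k)

pool = potential_pool(center=(14, 14), radius=5, shape=(28, 28), fraction=0.2)
```

The same function would serve an L2/3 cell reading from a 2D SDR if it is given a receptive field; a one-to-one mapping is the special case `radius=0`.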

If, as can already be done, our spatial pooler outputs a 2D array of bits (i.e. a 2D SDR), do you see each L2/3 cell / minicolumn as simply having a one-to-one mapping to the input SDR, or does each L2/3 cell also have a receptive field in which it can connect to multiple inputs?

This ignores the 0.5 mm dendrite ‘halo’ element of the hex grid activations.

I’m just curious to see what the best and most plausible approach to feeding the input from the SP into a Calvin layer / L2/3 layer would be.

I’ve been looking at “Project : Full-layer V1 using HTM insights” to get an idea of the ‘dendritic extent’, as gimery puts it, but I wasn’t sure how to handle the actual input part.

Cheers

Repeating a key point: all of this describes what is happening in the L2/3 layer. At this level all we are working with is patterns and NO predictive memory. L2/3 is also the layer that talks with other maps through the output axons of the L2/3 cell bodies. Likewise, the sensory inputs and rising axons are projected upwards through all layers, terminating in the dense mat of L1.

The projecting axons’ lateral branches do make connections with inhibitory interneurons, not shown here.

I am certain that there are connections up and down the minicolumn between the layers but I am not ready to state exactly how these connections work. These connections are key to describing the relationship between pattern recognition and temporal prediction. Working out these rules would be key to understanding the training rules of predictive memory.

Now on to the parts relative to hex-grid theory …

If you look at my entry (#34) in the “Project : Full-layer V1 using HTM insights” thread, I put figures on the sizes of the various elements for minicolumn spacing.

Each cell in the mini-column has a few dendrites, and each one holds at least a single SDR and maybe a few.


Referring to this picture from that post: each blue circle is a single mini-column. This is 100 or so cell bodies. Each cell body has its own dendrites - say 10 or so for a nice round number.

It turns out that the rising projecting axons are also spaced on this 30 µm spacing so you can assume that each blue circle also has a rising axon bundle.

The large black circle is the reach of the dendrites (+/- 250 µm, or 500 µm total) for the minicolumn in the center of the diagram. That gives the dendrites of each cell in each minicolumn access to about 200 or so rising axon clusters. This can be thought of as a “receptive field”: the blue mini-columns and rising axons within this black circle, for this minicolumn. This is all repeated for the next minicolumn. This is the repeating structure for all minicolumns in the cortex.
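
As a rough check on that figure, here is a minimal sketch that counts lattice sites inside the dendritic reach. It assumes the rising axon clusters sit on an ideal hexagonal lattice with the 30 µm spacing quoted above; the function name `clusters_in_reach` is hypothetical, and real cortex is far less regular.

```python
import math

def clusters_in_reach(spacing_um=30.0, reach_um=250.0):
    """Count hexagonal-lattice sites within reach_um of a central site.

    The central site itself is included in the count.
    """
    n = int(reach_um // spacing_um) + 2      # generous bound on lattice indices
    count = 0
    for q in range(-2 * n, 2 * n + 1):
        for r in range(-2 * n, 2 * n + 1):
            # Axial hex coordinates (q, r) -> Cartesian position in µm.
            x = spacing_um * (q + r / 2.0)
            y = spacing_um * (math.sqrt(3) / 2.0) * r
            if math.hypot(x, y) <= reach_um:
                count += 1
    return count

n_clusters = clusters_in_reach()
```

On an ideal lattice this comes out at roughly 250 sites, the same order as the "about 200 or so" quoted above; a looser or less regular packing would bring the count down.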

In this paper, Horizontal Synaptic Connections in Monkey Prefrontal Cortex, the lateral connections from the L2/3 cells are given as about 500 µm on average.


The black beam in this picture is the long-distance lateral connection between the two minicolumns. The two minicolumn “receptive fields” connected by this link of the hex grid cover the space with very little overlap and very little missed space.

When you factor in the fact that there are several long-distance lateral connections emanating in all directions from each cell in the minicolumn, you can see that the number of possible space-covering hex grids is very large. Since the connections from any given cell do not fall on strict angles or lengths, the individual cells can form hex grids with different angles, spacing, and phasing.

So, recapping: every minicolumn has 100 cells that each have 10 dendrites, each potentially forming at least one SDR, possibly more (at least 1000 SDRs per minicolumn).

Each dendrite, if it went in a straight line from the cell body, passes at least 7 axon projection clusters, and with branching probably many more. This means that the area around each minicolumn is densely sampled by about 1000 branching dendrites, which should end up sampling every rising axon cluster within reach of the dendrites.
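
The arithmetic behind these round numbers, as a short sketch. All values are the ones quoted in this thread; the variable names are only for illustration, and the straight-path count assumes the dendrite runs exactly along a row of the 30 µm lattice.

```python
cells_per_minicolumn = 100   # "100 or so cell bodies"
dendrites_per_cell = 10      # "say 10 or so for a nice round number"
spacing_um = 30              # spacing of rising axon clusters
reach_um = 250               # dendritic reach from the cell body

# One SDR per dendrite is the lower bound used in the recap.
dendrites_per_minicolumn = cells_per_minicolumn * dendrites_per_cell

# Clusters crossed by a straight dendrite running along a lattice row.
clusters_on_straight_path = reach_um // spacing_um
```

Along a lattice row the straight-path count is 8 (sites at 30, 60, ... 240 µm), consistent with the "at least 7" above; other directions cross fewer sites head-on, which is where branching matters.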

The lateral connections link these minicolumns so that if they are responding to a learned pattern, even though it is sampled relatively sparsely, all the space in the resonating hex grid pattern is being sampled and bound together into a single larger unique pattern.

Here is a drawing relating the idealized concept to the messy biological bits.

Also see this post for more on the formation of larger patterns:

To directly answer this key point - the hex grid pattern IS the spatial pooler. I see this as a direct replacement for the spatial pooler in HTM theory.

The point of the pooler is to collect activations and sparsify them; the Hex-grid is a naturally sparse activation.

Hmm I guess I got my assumptions wrong.

My interpretation was that the formation / spacing of the main hex-grid-forming mini columns was a result of the 0.5 mm dendrite ring or halo exciting other mini columns on its periphery. If by coincidence two or more other mini columns are receiving some of the input from the same SDR and their dendritic rings happen to overlap, then this would lead to an overlap of excitation, causing more mini columns to become entrained and activated.

So apologies for the poor drawing, my iPad skills only go so far. I originally thought, as hopefully shown in the bottom part, that a mini column excites its dendrites (black ring), and if by luck another mini column on the circumference of this ring is also being activated by the input SDR, then it too would activate its dendrite ‘ring’ (red). This would lead to an overlap between the black and red dendrite rings and cause local excitation of other mini columns where this occurs, leading these mini columns to fire, the process to repeat, and the hex grid pattern to grow, as Calvin would say.

At the top is just an example using Gimery’s hexagonal array layout. In this example, because the distance is fixed (I shortened it for ease of drawing), you can get a number of hex grid patterns to emerge in different orientations, but not different scales or possibly phases. It all depends on where along the dendrite circumference the second mini column happens to be.

Looking over your post, I assume that it is the ‘lateral axonal projection’ that gives rise to the hexagonal spacing and is also the cause of activation / recruitment of other mini columns?

My original intention was to follow the previous approach. If I were to attempt to describe it programmatically, it might go something like this.

SDR from the SP provided as input to the L2/3 layer -> 1-1 mapping of mini columns to the SDR, i.e. each small circle is linked with a single SDR ‘bit’ (though you could have a receptive field if you wanted) -> for each SDR bit that is active, you excite that mini column -> for each active mini column in the L2/3 layer, you go through its dendrite ring and excite the mini columns on its circumference -> for each active mini column in the L2/3 layer, you also perform local inhibition.

Unlike the standard SP / TM approach, this requires some degree of temporal element, i.e. repeating this process several times (i.e. the equivalent of every 10 Hz cycle) until it settles on a hex grid.
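
The steps above can be sketched roughly as follows. This is only an illustration of the loop as described in this post, not a confirmed model of the hex-grid mechanism: the names `build_grid`, `within`, and `settle`, the lattice units, the 0.5 lateral weight, the threshold, and the greedy winner-take-all inhibition are all assumptions made for the example.

```python
import math

def build_grid(n, spacing=1.0):
    """Positions for an n x n patch of mini columns on a hexagonal lattice."""
    return {(q, r): (spacing * (q + r / 2.0), spacing * math.sqrt(3) / 2.0 * r)
            for q in range(n) for r in range(n)}

def within(grid, lo, hi):
    """Map each column to the columns at distance d with lo < d <= hi."""
    return {a: [b for b, (bx, by) in grid.items()
                if lo < math.hypot(ax - bx, ay - by) <= hi]
            for a, (ax, ay) in grid.items()}

def settle(grid, sdr_bits, ring_radius=4.0, ring_width=0.5,
           inhibit_radius=2.0, threshold=1.0, steps=5):
    """Repeat excite -> ring excitation -> local inhibition until stable."""
    sdr_bits = set(sdr_bits)
    ring = within(grid, ring_radius - ring_width, ring_radius + ring_width)
    near = within(grid, 0.0, inhibit_radius)
    active = set(sdr_bits)
    for _ in range(steps):
        # 1-1 mapping: each active SDR bit drives its own mini column.
        score = {c: (1.0 if c in sdr_bits else 0.0) for c in grid}
        # Each active column excites the columns on its ring.
        for c in active:
            for target in ring[c]:
                score[target] += 0.5
        # Local inhibition: strongest columns win and silence close neighbours.
        winners = set()
        for c in sorted(grid, key=lambda c: (-score[c], c)):
            if score[c] < threshold:
                break
            if not any(nb in winners for nb in near[c]):
                winners.add(c)
        if winners == active:      # settled
            break
        active = winners
    return active

grid = build_grid(12)
active = settle(grid, {(2, 2), (6, 2), (2, 6), (4, 4), (3, 3)})
```

Because inhibition is applied greedily from the strongest column down, no two surviving columns end up closer than `inhibit_radius`; whether the survivors form a clean hex grid depends on the input and on the ring and inhibition radii chosen.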

This was just a very crude first glance, but it looks like it is probably inaccurate.

If you have any recommendations or corrections I would appreciate them.

Regards

Momiji

I am a bit sorry that my attempts to describe what I am thinking are hard to understand. It all seems so clear and logical in my head, but from what you are saying I am doing a terrible job of describing it.

I hope that the additional drawings in post #62 above are helpful.

I envision that the competition for hex-grid formation runs at the gamma rate (40 Hz) and is resolved for each alpha cycle (10 Hz).

The output from this alpha rate update is what is communicated to maps attached by projecting axons from L2/3.
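
One way to read that timing in code, as a sketch only: the competition step runs four times (40 Hz / 10 Hz) per alpha cycle, and only the state at the end of each cycle is passed on. The function `run_alpha_cycles` and its `step` argument are hypothetical placeholders for whatever competition update is used.

```python
def run_alpha_cycles(step, state, n_cycles, gamma_hz=40, alpha_hz=10):
    """Apply `step` at the gamma rate; emit the state once per alpha cycle."""
    steps_per_cycle = gamma_hz // alpha_hz   # 4 with the figures above
    outputs = []
    for _ in range(n_cycles):
        for _ in range(steps_per_cycle):
            state = step(state)              # one round of grid competition
        outputs.append(state)                # what L2/3 would project onward
    return outputs

# Toy step function that just counts gamma ticks.
ticks = run_alpha_cycles(lambda s: s + 1, 0, n_cycles=3)
```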

Oh, don’t be sorry, lol. I would be nowhere without your descriptions and images. It is just me coming at this with an amalgam of different posts and readings from Calvin’s book and getting confused. Unfortunately I have not had as long a history in neuroscience to fix the terms and structures in my mind, so I can get a bit confused by what I see; in particular, which parts are receiving input from, say, step 1 and which parts are outputting into step 2, so to speak.

Kind Regards

Momiji

I have been thinking about Calvin tiles for a very long time; how they might work and the neural substrates that they depend on. I was reviewing some old notes on the Braddick & Sleigh book “Physical and Biological Processing of Images” this evening and found this note.

Title page of the notes with a date in the margin:

And this comment in my notes:

Sorry about the sloppy handwriting - this was me thinking aloud as I read the text.

While those are possible connections they are not the correct distance away.
The lateral connections are usually a fixed distance apart.
You should like them - they are natural triangles.

More details on the geometry here:

I’m quite late to the party, but lateral connections are mGluR, right? (efference copies of L6 descending axons) [1]. Those “nasty” synapses are really “interesting”: they integrate (sum) over time, are really persistent, and only have modulatory effects on proximal dendrites. My understanding is that the drivers (iGluR) are 10x more powerful.

[1] C. C. Lee and S. M. Sherman, “Modulator property of the intrinsic cortical projection from layer 6 to layer 4,” Front. Syst. Neurosci., vol. 3, pp. 1–5, Feb. 2009.

How about the layer 2/3?
These are the ones that most concern me for grid forming behavior.

Also, do you know how these projection axons interact with the surrounding inhibitory interneurons?
This question is addressed to both layers 2/3 and 6.

Those synapses are on L4 PC proximal dendrites, i.e. they modulate the cortical column “input”. I guess they are what tips the balance in deciding which cells win the inhibition competition. The axons are really dense. According to this review [1], they are a big contributor to the column input (at least for V1). Since the shape of the L6 axons is hexagon-to-hexagon, I thought that those were your “lateral” inputs. Sorry about that.

[1] T. Binzegger, “A Quantitative Map of the Circuit of Cat Primary Visual Cortex,” J. Neurosci., vol. 24, no. 39, pp. 8441–8453, 2004.

@Bitking Because of a conversation elsewhere, I reinvestigated these ideas and brought them up with Jeff again. My first questions were on the L2/3 axonal projections:

These axons have one major bifurcation, where one path goes out of the cortex, so we are only talking about the path that stays in the cortex. The axons do not split up into lots of paths inside the cortex as shown on the right below.

The picture on the left looks right (except for missing the path out of cortex). But I can’t find evidence for the picture on the right. The axon doesn’t split up into a cloud to create this circle of influence. This cortical-cortical axon will create a cluster of synapses about 0.5mm away from the soma, then continue onward in one direction to do it again later. It’s more like a subway line than a hub.

Why are we restricting the path to local operations?
The L2/3 distant reciprocal connections are running in parallel with the local competition. This is part of the H of HTM.

BTW: where did you get the idea that there is a single bifurcation?
I have my research papers to draw on but even a casual google search (pyramidal axonal projections topology; show as image) shows that this is not so:
https://www.researchgate.net/figure/Excitatory-neuron-types-in-layers-2-6-of-the-A-medial-prefrontal-and-B-primary_fig1_322796448

The original Calvin book was my starting place. In the text, he acknowledges that he was interpreting the work of a neurobiologist and that is not his direct specialty. I have invested the time to work through a more detailed version that is more biologically plausible. That said - he offered enough of a starting point that I could see what he was trying to convey.

I have invested a considerable amount of text on how the fuzzy cloud of axonal projections around a given cell establishes enough variation to support learning a range of scaling/rotation/translation possibilities; this is a feature and not a bug.

As far as linear vs triangles: for the self-reinforcement / inhibition mechanism, the shortest path back to the cell is a triangle. Those three cells and the inhibitory basket cells between them form a self-reinforcing group. At each node of this triangle there are many cells that theoretically could respond - what is special about these three is that they are all seeing some part of a pattern that was learned at some point in the past and are the best possible match at this time; they vote to prove this. Even if each and every one did not actually see “this” pattern, they are generalizing that this is the best match right now. The joy of a distributed pattern is that this little triangle unit is voting at the same time as tens or hundreds of other units, and bad matches get shouted down if the general population thinks that a small subsection is wrong. This is actually very similar to the TBT model.

Last note - keep in mind that the basic HTM mechanism assumes a unit signal per unit time. The hex-grid voting mechanism is actually implemented with a phase/rate mechanism. We already acknowledge that phase is critical to the temporal voting mechanism. Phase + rate is important to the hex-grid formation.

I’m not making any comment on the boundaries of cortical columns here.

Yes, they are all running in parallel, but this is not the hierarchy. How are lateral connections between L2/3 neurons part of the hierarchy?

I got it from Jeff, but now I am questioning it in light of Levitt et al. 1993.

He’s OOTO the rest of the week, so this may need to wait until Monday.

Here is a good example showing 3 clusters of horizontal long-distance axonal connections of a L2/3 pyramidal cell:


Source: A biological blueprint for the axons of superficial layer pyramidal cells in cat primary visual cortex | Brain Structure and Function

I note that there are 3 clusters. Coincidence or consequence of Bitking hex-grid theory?

One further question to Bitking: you talk about 0.5 mm long-distance connections, but I more often see a distance of several mm in the literature. Is it because the articles focus more on V1, which has different characteristics? (Gabor filter vs hex-grid?)

Papers like the ones referenced here?

My perspective on this is that long-distance lateral connections (besides being a voting mechanism) result in representations which associate a wider collection of features than just the input contributed by a single column in isolation. Thus, this representation is more abstract, and is the basis of the hierarchy.

I believe there are also other mechanisms (such as TP) happening in this layer, whose purpose is also to assist in forming abstractions (this is the object layer in TBT, after all). But even this mechanism alone should intuitively be forming abstractions.

Considering this increase in level of abstraction, I believe the transition between input and output layers within the same region is actually the logical boundary between hierarchical levels (not the transition from the output layer of one region to the input layer of another region).

Maybe I am attaching too much importance to a detail, but I am still struggling with the length of the long-distance lateral connections of L2/3 pyramidal cells:

  • What is the “standard” length?
  • Is there a “standard” length?

For the first question, I haven’t found the 0.5mm distance in the papers mentioned in the other thread. I think that this number came directly from the book of Calvin where he gives more details but without mentioning clearly the source:

“That 0.5mm mentioned earlier is really as small as 0.4mm (in primary visual cortex of monkeys) or as large as 0.85mm (in sensorimotor cortex)” - The Cerebral Code, Calvin

But the paper you cited in the first post of this thread talks about several millimeters (I have seen this in other papers as well):

“As is the case for other regions of the macaque monkey neocortex, pyramidal neurons in the supragranular layers of the dorsolateral prefrontal cortex (PFC) furnish intrinsic axon collaterals that travel for substantial distances, up to several millimeters, tangential to the pial surface (Levitt et al., 1993).” - https://www.researchgate.net/publication/12675879_Horizontal_Synaptic_Connections_in_Monkey_Prefrontal_Cortex_An_In_Vitro_Electrophysiological_Study

I think that it matters because it would mean that the long-distance lateral connection between two minicolumns would be far longer than the receptive field size. In other words, there would be considerable missed space: not all inputs could participate in a given grid pattern. Right?
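
The missed-space concern can be put in rough numbers. A minimal sketch, assuming a single ideal hexagonal grid whose nodes are exactly one link length apart, each covering a disc with the 250 µm dendritic reach quoted earlier in the thread; `covered_fraction` is a hypothetical helper, and the simple ratio only holds while neighbouring discs do not overlap.

```python
import math

def covered_fraction(link_um, reach_um=250.0):
    """Share of the sheet inside some node's dendritic reach.

    Nodes sit on a hexagonal lattice with spacing link_um; each covers a
    disc of radius reach_um. Valid only when the discs do not overlap.
    """
    if link_um < 2 * reach_um:
        raise ValueError("discs overlap; this simple ratio does not apply")
    disc_area = math.pi * reach_um ** 2
    area_per_node = math.sqrt(3) / 2 * link_um ** 2   # hexagonal lattice cell
    return disc_area / area_per_node

for link in (500, 850, 2000):
    print(link, round(covered_fraction(link), 3))
```

With 500 µm links the discs just touch and cover about 91% of the sheet (the hexagonal circle-packing density, π/(2√3)); with 2 mm links a single grid would cover under 6%. This counts one grid with one link length only, whereas the earlier posts argue that many grids with different spacing, angle, and phase can form.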

About the existence of a “standard” length, Calvin did not give his source (or is it hidden in the long exhaustive list at the end?). Have you seen a graph of the distribution of these long-distance lateral connections?

Maybe the grid-forming patterns are stable enough to cope with lateral connections of different lengths. I would like to build some simulations to see how it behaves.

I don’t know the definitive resource or answer to your questions.
I have been through all of the references in the Calvin book and he does defend the lengths he references. These are lengths associated with L2/3.

The references in the “V1 post” give a pretty good breakdown, and the lengths do have a statistical mean that clusters around this standard length. I see the range of input lengths as useful in allowing learning to form grids of a wide variety of spacing, orientation, and phasing.

I have also seen the references that throw in much longer lengths; these seem to go with the lower levels and are not part of the hex-grid forming mechanism. I have often wondered what they do but I don’t have any real theoretical foundation to explain what is going on there. I am open to ideas - no matter how wild.

BTW: my current wild idea (with very little foundation to support it) is that this forms larger grids to discipline the smaller grids and to handle communications between scales of representation; sort of a lateral hierarchy. I don’t have a shred of evidence to support this, but in long reflections when I am out walking at night this is the only thing I could come up with.
