HTM Mini-Columns into Hexagonal Grids!

Hmm I guess I got my assumptions wrong.

My interpretation was that the formation / spacing of the main hex grid of mini columns resulted from the 0.5mm dendrite ring, or halo, exciting other mini columns on its periphery. If by coincidence two or more other mini columns receive some of the input from the same SDR, and their dendritic rings happen to overlap, then this overlap of excitation would cause more mini columns to become entrained and activated.

So apologies for the poor drawing, my iPad skills know only so much. I originally thought, as hopefully shown in the bottom part, that a mini column excites its dendrites (black ring), and if by luck another mini column around the circumference of this ring is also being activated by the input SDR, then it too would activate its dendrite ‘ring’ (red). This would lead to an overlap between the black and red dendrite rings and cause local excitation of other mini columns where this occurs, leading those mini columns to fire and the process to repeat, growing the hex grid pattern, as Calvin would say.

At the top is just an example using Gimerys’ hexagonal array layout. In this example, because the distance is fixed (I shortened it for ease of drawing), you can get a number of hex grid patterns to emerge in different orientations, but not scales or possibly phases. It all depends on where along the dendrite circumference the second mini column happens to be.

Looking over your post, I assume that it is the ‘lateral axonal projection’ that gives rise to the hexagonal spacing and is also the cause of activation / recruitment of other mini columns?

My original intention was to follow the previous approach. If I were to attempt to describe it programmatically, it might go something like this:

  • SDR from the SP is provided as input to the L2/3 layer.
  • 1-to-1 mapping of mini columns to the SDR, i.e. each small circle is linked with a single SDR ‘bit’ (though you could have a receptive field if you wanted).
  • For each SDR bit that is active, excite that mini column.
  • For each active mini column in the L2/3 layer, go through its dendrite ring and excite the mini columns on its circumference.
  • For each active mini column in the L2/3 layer, also perform local inhibition.

Unlike the standard SP / TM approach, this requires a temporal element, i.e. repeating the process several times (at a 10 Hz equivalent, say) until it settles on a hex grid.
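The steps above can be sketched as a small settling loop. This is only my hypothetical reading of the described procedure, assuming a 2-D sheet of mini columns, a fixed dendrite-ring radius, and local winner-take-all inhibition; all parameter values (grid size, radii, thresholds, iteration count) are illustrative, not biological.

```python
import numpy as np

SIZE = 32       # mini columns per side of the L2/3 sheet (illustrative)
RING_R = 5      # dendrite "ring" radius in grid units (stand-in for ~0.5mm)
INHIB_R = 2     # local inhibition radius (illustrative)

def ring_mask(cy, cx):
    """Boolean mask of cells on the circumference of the dendrite ring."""
    y, x = np.ogrid[:SIZE, :SIZE]
    d = np.hypot(y - cy, x - cx)
    return np.abs(d - RING_R) < 0.7

def settle(sdr_active, steps=10):
    """Repeat excite-ring / inhibit-locally until the pattern is stable."""
    active = sdr_active.astype(bool)          # 1-to-1 SDR bit -> mini column
    for _ in range(steps):
        excite = np.zeros((SIZE, SIZE))
        for cy, cx in zip(*np.nonzero(active)):
            excite[ring_mask(cy, cx)] += 1.0  # ring excitation
        # a cell is driven if it is SDR-active or excited by overlapping rings
        drive = excite + 2.0 * sdr_active
        new_active = np.zeros_like(active)
        for cy, cx in zip(*np.nonzero(drive > 1.5)):
            lo_y, lo_x = max(cy - INHIB_R, 0), max(cx - INHIB_R, 0)
            patch = drive[lo_y:cy + INHIB_R + 1, lo_x:cx + INHIB_R + 1]
            if drive[cy, cx] >= patch.max():  # local inhibition: winner only
                new_active[cy, cx] = True
        if np.array_equal(new_active, active):
            break                             # settled
        active = new_active
    return active
```

Feeding in a sparse random SDR and iterating until `active` stops changing mirrors the 10 Hz repeat-until-settled idea; whether a clean hex pattern emerges depends entirely on the (made-up) radii and thresholds.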

This was just a very crude first glance, but it looks like it is probably inaccurate.

If you have any recommendations or corrections I would appreciate them.




I am a bit sorry that my attempts to describe what I am thinking are hard to understand. It all seems so clear and logical in my head, but from what you are saying I am doing a terrible job of describing it.

I hope that the additional drawings in post #62 above are helpful.

I envision that the competition for hex-grid formation runs at the gamma rate (40 Hz) and is resolved once per alpha cycle (10 Hz).

The output of this alpha-rate update is what is communicated to attached maps by projecting axons from L2/3.
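The nested timing described here (competition at 40 Hz, resolution at 10 Hz) can be written as a minimal skeleton. The two step functions are placeholders of my own invention; only the 40 Hz / 10 Hz nesting reflects the post.

```python
GAMMA_HZ = 40
ALPHA_HZ = 10
TICKS_PER_ALPHA = GAMMA_HZ // ALPHA_HZ   # 4 gamma ticks per alpha cycle

def run_alpha_cycle(state, gamma_step, alpha_resolve):
    """Run the hex-grid competition at gamma rate; resolve once per alpha cycle."""
    for _ in range(TICKS_PER_ALPHA):
        state = gamma_step(state)        # one round of excite/inhibit competition
    return alpha_resolve(state)          # settled pattern sent out via L2/3 axons
```

So whatever the competition dynamics are, four rounds of them run before each output is committed and projected onward.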

Oh don’t be sorry, lol, I would be nowhere without your descriptions and images. It is just me coming at this with an amalgam of different posts and readings from Calvin’s book, and getting confused. Unfortunately I have not had as long a history in neuroscience to fix the terms and structures in my mind, so I can get a bit confused by what I see, in particular which parts are receiving input in, say, step 1 and which parts are outputting into step 2, so to speak.

Kind Regards



I have been thinking about Calvin tiles for a very long time; how they might work and the neural substrates that they depend on. I was reviewing some old notes on the Braddick & Sleigh book “Physical and Biological Processing of Images” this evening and found this note.

Title page of the notes with a date in the margin:

And this comment in my notes:

Sorry about the sloppy handwriting - this was me thinking aloud as I read the text.

While those are possible connections they are not the correct distance away.
The lateral connections are usually a fixed distance apart.
You should like them - they are natural triangles.

More details on the geometry here:


I’m quite late to the party, but the lateral connections are mGluR, right? (Efference copies of L6 descending axons [1].) Those “nasty” synapses are really “interesting”: they integrate (sum) over time, are very persistent, and have only modulatory effects on proximal dendrites. My understanding is that the drivers (iGluR) are 10x more powerful.

[1] C. C. Lee and S. M. Sherman, “Modulator property of the intrinsic cortical projection from layer 6 to layer 4,” Front. Syst. Neurosci., vol. 3, pp. 1–5, 2009.

How about layer 2/3?
These are the ones that most concern me for grid-forming behavior.

Also, do you know how these projection axons interact with the surrounding inhibitory interneurons?
This question is addressed to both layers 2/3 and 6.


Those synapses are on L4 PC proximal dendrites, i.e. they modulate the cortical column’s “input”. I guess they are what tips the balance toward the inhibition winners. The axons are really dense; according to this review [1] they are a big contributor to the column’s input (at least for V1). Since the shape of the L6 axons is hexagon-to-hexagon, I thought that was where your “lateral” inputs came from. Sorry about that.

[1] T. Binzegger, “A Quantitative Map of the Circuit of Cat Primary Visual Cortex,” J. Neurosci., vol. 24, no. 39, pp. 8441–8453, 2004.


@Bitking Because of a conversation elsewhere, I reinvestigated these ideas and again brought them up with Jeff. My first questions were on the L2/3 axonal projections:

These axons have one major bifurcation, where one path goes out of the cortex, so we are only talking about the path that stays in the cortex. The axons do not split up into lots of paths inside the cortex as shown on the right below.

The picture on the left looks right (except for missing the path out of cortex). But I can’t find evidence for the picture on the right. The axon doesn’t split up into a cloud to create this circle of influence. This cortical-cortical axon will create a cluster of synapses about 0.5mm away from the soma, then continue onward in one direction to do it again later. It’s more like a subway line than a hub.


Why are we restricting the path to local operations?
The L2/3 distant reciprocal connections are running in parallel with the local competition. This is part of the H of HTM.

BTW: where did you get the idea that there is a single bifurcation?
I have my research papers to draw on, but even a casual Google search (pyramidal axonal projections topology; show as image) shows that this is not so:

“The picture on the left looks right (except for missing the path out of cortex). But I can’t find evidence for the picture on the right. The axon doesn’t split up into a cloud to create this circle of influence. This cortical-cortical axon will create a cluster of synapses about 0.5mm away from the soma, then continue onward in one direction to do it again later. It’s more like a subway line than a hub.”

The original Calvin book was my starting place. In the text, he acknowledges that he was interpreting the work of a neurobiologist and that it is not his direct specialty. I have invested the time to work through a more detailed version that is more biologically plausible. That said, he offered enough of a starting point that I could see what he was trying to convey.

I have invested a considerable amount of text on how the fuzzy cloud of axonal projections around a given cell establishes enough variation to support learning a range of scaling/rotation/translation possibilities; this is a feature and not a bug.

As far as linear vs triangles: for the self-reinforcement / inhibition mechanism, the shortest path back to the cell is a triangle. Those three cells, and the inhibitory basket cells between them, form a self-reinforcing group. At each node of this triangle there are many cells that could theoretically respond; what is special about these three is that they are all seeing some part of a pattern that was learned at some point in the past and are the best possible match at this time, and they vote to prove this. Even if not every one of them actually saw “this” pattern, they are generalizing that this is the best match right now. The joy of a distributed pattern is that this little triangle unit is voting at the same time as tens or hundreds of other units, and bad matches get shouted down if the general population thinks a small subsection is wrong. This is actually very similar to the TBT model.

Last note - keep in mind that the basic HTM mechanism assumes a unit signal per unit time. The hex-grid voting mechanism is actually implemented with a phase/rate mechanism. We already acknowledge that phase is critical to the temporal voting mechanism. Phase + rate is important to hex-grid formation.
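The “shouted down” population vote described above can be illustrated with a toy tally. This is my own sketch, not part of the post: each hypothetical triangle unit proposes a best-match label, and units disagreeing with the population majority are suppressed. The labels are made up.

```python
from collections import Counter

def resolve_votes(unit_votes):
    """Keep only the units whose proposal agrees with the population majority."""
    tally = Counter(unit_votes)
    winner, _ = tally.most_common(1)[0]
    surviving = [i for i, v in enumerate(unit_votes) if v == winner]
    return surviving, winner

# six triangle units vote; the two minority units get "shouted down"
units, match = resolve_votes(["cup", "cup", "bowl", "cup", "cup", "bowl"])
```

A real phase/rate implementation would replace the discrete labels with spike timing and rate, but the suppression-by-majority logic is the same shape.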


I’m not making any comment on the boundaries of cortical columns here.

Yes, they are all running in parallel, but this is not the hierarchy. How are lateral connections between L2/3 neurons part of the hierarchy?

I got it from Jeff, but now I am questioning it in light of Levitt et al. 1993.

He’s OOTO the rest of the week, so this may need to wait until Monday.


Here is a good example showing 3 clusters of horizontal long-distance axonal connections of a L2/3 pyramidal cell:


I note that there are 3 clusters. Coincidence, or a consequence of Bitking’s hex-grid theory?

One further question for Bitking: you talk about 0.5mm long-distance connections, but I more often see a distance of several mm in the literature. Is it because the articles focus more on V1, which has different characteristics? (Gabor filters vs hex-grids?)


Papers like the ones referenced here?


My perspective on this is that long-distance lateral connections (besides being a voting mechanism) result in representations that associate a wider collection of features than just the input contributed by a single column in isolation. Thus, this representation is more abstract, and is the basis of the hierarchy.

I believe there are also other mechanisms (such as TP) happening in this layer whose purpose is also to assist in forming abstractions (this is the object layer in TBT, after all). But even this mechanism alone should intuitively be forming abstractions.

Considering this increase in level of abstraction, I believe the transition between input and output layers within the same region is actually the logical boundary between hierarchical levels (not the transition from the output layer of one region to the input layer of another region).


Maybe I am attaching too much importance to a detail, but I am still struggling with the length of the long-distance lateral connections of L2/3 pyramidal cells:

  • What “standard” length?
  • Is there a “standard” length?

For the first question, I haven’t found the 0.5mm distance in the papers mentioned in the other thread. I think this number came directly from Calvin’s book, where he gives more details but without clearly citing the source:

“That 0.5mm mentioned earlier is really as small as 0.4mm (in primary visual cortex of monkeys) or as large as 0.85mm (in sensorimotor cortex)” - The Cerebral Code, Calvin

But the paper you cited in the first post of this thread talks about several millimeters (I have seen this in other papers as well):

“As is the case for other regions of the macaque monkey neocortex, pyramidal neurons in the supragranular layers of the dorsolateral prefrontal cortex (PFC) furnish intrinsic axon collaterals that travel for substantial distances, up to several millimeters, tangential to the pial surface (Levitt et al., 1993).” -

I think it matters because it would mean that the long-distance lateral connection between two minicolumns would be much bigger than the receptive field size. In other words, there would be considerable missed space: not all inputs could participate in a given grid pattern. Right?

About the existence of a “standard” length, Calvin did not give his source (or is it hidden in the long, exhaustive list at the end?). Have you seen a graph of the distribution of these long-distance lateral connections?

Maybe the grid-forming patterns are stable enough to cope with lateral connections of different distance. I would like to build some simulations to see how it behaves.


I don’t know the definitive resource or answer to your questions.
I have been through all of the references in the Calvin book and he does defend the lengths he references. These are lengths associated with L2/3.

The references in the “V1 post” give a pretty good breakdown, and the lengths do have a statistical mean that clusters around this standard length. I see the range of input lengths as useful for allowing learning to form grids with a wide variety of spacings, orientations, and phasings.

I have also seen references that throw in much longer lengths; these seem to go with the lower levels and are not part of the hex-grid forming mechanism. I have often wondered what they do, but I don’t have any real theoretical foundation to explain what is going on there. I am open to ideas - no matter how wild.

BTW: my current wild idea (with very little foundation to support it) is that these form larger grids that discipline the smaller grids, to handle communication between scales of representation; a sort of lateral hierarchy. I don’t have a shred of evidence to support this, but in long reflections while out walking at night, this is the only thing I could come up with.


I believe one fundamental characteristic of Calvin-like grid formation is the part played by the effective reach of inhibitory diffusion around currently activated spots.
If you assume each neuron in L2/3 is on the verge of spiking at each (otherwise mediated?) gamma tick, then the resulting grid is the tightest packing of marbles on a plane, i.e. a hex grid, where the minimum spacing depends mostly on the inhibition radius. In that sense, “longer” axonal ranges (I have repeatedly come across figures in the range of 3 mm in diameter for the lateral axonal arbors of L2/3) do not impact the tightness of the packing per se, only the range at which one cell can directly recruit and attract others.
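A toy demonstration of the claim that inhibition radius sets the minimum spacing: if every candidate cell is equally ready to spike and each winner suppresses all neighbours within an inhibition radius, the survivors form a dense packing whose minimum pairwise distance equals that radius. The sheet size, cell count, and radius below are arbitrary illustration values, not biological parameters.

```python
import numpy as np

def winners_under_inhibition(positions, inhib_radius, priority):
    """Greedy winner-take-all: highest priority first, suppress within radius."""
    order = np.argsort(-priority)    # most-ready cells win first
    chosen = []
    for i in order:
        p = positions[i]
        if all(np.hypot(*(p - positions[j])) >= inhib_radius for j in chosen):
            chosen.append(i)
    return chosen

rng = np.random.default_rng(1)
pts = rng.random((400, 2)) * 10.0    # candidate cells on a 10x10 sheet
keep = winners_under_inhibition(pts, inhib_radius=1.0, priority=rng.random(400))
# every surviving pair is at least one inhibition radius apart
```

Note that this only guarantees the minimum spacing; converging on an actual hexagonal arrangement would additionally need the recruitment/attraction via the longer axonal ranges mentioned above.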


Another point, after reading Bitking’s introductory post again:

For it to really work - implementing voting + sparsity (SP) + economy, and staying Calvin-like while explaining the reported 0.5mm spacing, given the 1.5mm radius of axonal arbors - I’d go with the hybrid model (surround inhibition + reverberation), with a twist:

  • Strong inhibition becomes dominant only when there is actually some uncertainty to resolve. This ensures that even when uncertain (voting), or (maybe?) learning something entirely new, the activation pattern will eventually converge to some 0.5mm-spaced grid.
  • When the input is straightforward (or straightforward given context), only the obvious spots become activated in the first place. So the overall signal isn’t very loud and doesn’t incur much inhibitory effort… and thus, in cruise mode, grid forming (spreading) can work by Calvin resonance alone, following the synapses formed by the already-known grid, which is now replayed.
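The two regimes above can be caricatured with an uncertainty-gated inhibition gain. The uncertainty measure here (the number of cells within a band of the current leader) and all thresholds are my own invented stand-ins for whatever the biology actually computes.

```python
import numpy as np

def inhibition_gain(activations, tie_band=0.1, low=0.2, high=2.0):
    """Strong inhibition only when several cells are near-tied with the leader."""
    a = np.asarray(activations, dtype=float)
    near_ties = np.sum(a >= a.max() - tie_band)
    return high if near_ties > 1 else low   # competition mode vs cruise mode

# one clear winner -> cruise mode: weak inhibition, resonance does the work
g1 = inhibition_gain([0.9, 0.3, 0.2])
# several near-ties -> uncertainty: strong inhibition forces convergence
g2 = inhibition_gain([0.9, 0.85, 0.88])
```

In a simulation, the returned gain would scale the surround-inhibition term each gamma tick, so a confident replay stays quiet while an ambiguous input triggers the full 0.5mm-spaced competition.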

Is this an intuition you have, or do you see similarities with mathematical or geometrical transformations?

Also, I have heard @jhawkins talk more than once about ideas on how different phases play roles in the learning and operation of neuron clusters, but never in great detail. Does anyone know more about this? If I understand correctly, it is still on topic.
