HTM Mini-Columns into Hexagonal Grids!

Agreed. I just pointed that out because it looked like that in some figures in the paper shown in the recent hackers’ hangout.

Most layers receive most of their input from themselves. There could be resonance in any layer. Hex grids are a replacement for spatial pooling, which determines minicolumn states. Minicolumns exist in most layers*, so why assume hex grids are only formed in one layer?
*There’s decent evidence for minicolumns in L4, L5 TT, L5 ST, and L6a CT.

I’m not sure L2/3 is suited for resonance since its firing is sparse compared to other layers, meaning a small fraction of cells have much higher firing rates than the other cells at a given point in time (during a whisker deflection). L5 TT cells might be more suited for resonance because they have generally high firing rates. I imagine that’s more suited for network activity settling from one state into a more hex-grid-like state. I don’t know though.

I’m not saying that L2/3 can’t be where hex grids form, just that I don’t see why it has to be L2/3. For example, L5 slender tufted cells are similar to L2/3 in several ways*, and they are more directly related to the thalamus** and so are more suited for utilizing cortex-thalamus resonance. They also have higher firing rates which might make them more suited for resonance. They don’t have a place in HTM theory unlike L2/3, so assigning the role of hex grids to L5 ST cells is more compatible with HTM theory.

*They are suited for voting, e.g. long lateral connections and projections to other regions.

**reasons

They receive input from higher order matrix thalamus, and they receive input from one group of L6 corticothalamic cells just like L4 receives input from another group of L6 CT cells. Each of those groups of L6 CT cells forms their apical tufts mainly in the corresponding layer, either L4 or L5st (L5a in this region). L2 and L3 project to L5a (slender tufted cells in this case) way more than to L5b, and L5a projects to L2 a lot.
(Barrel cortex.)

I hadn’t thought about learning. That seems like it could clean up the messiness caused by messy axon arbors and whatnot into something more neatly grid-like.

I’m not sure maintaining topology is enough to convey a hex grid pattern. For example, if the topological axons spread out too much but still maintain a blurred topology, that might not be spatially precise enough for hex grids. It seems like it would be really hard to find strong evidence for this.

I don’t understand what you think L5 does in hex grids. Are you talking about L5 slender tufted cells or thick tufted cells?

That is not what I usually come across. If you have more info about this, I’ll take it.

I’m not too concerned about minicolumns when envisioning L2/3 or Calvin, to be honest.

On the other side,

I was looking for some support for those. Please share if you know of something, in the HTM sense of them sharing proximal input. (From a developmental and structural perspective, I believe they are indeed a thing, spanning all layers.)

Another peculiarity of L2/3 is its proximity to L1, giving L2 cells the ability to tap into L1 without an apical trunk. Maybe there’s something to that.

Cheers

I fully accept that predictive cell thing as it solves several important holes in the theory of operation that I have been trying to piece together since I was exposed to Calvin many years ago.

How to combine predictive cells and grid forming cells so they work together?

In most of the cortex wiring diagrams that I have seen there is no direct path from L5 (predictive cells) to L2/3 (grid forming cells). I can’t help but notice that there is a projection from L5 to the thalamus, and there are projections from the thalamus to L2/3 that could serve to link in the output of the predictive layer. I don’t have a firm wiring diagram for how that could happen, but it is a strong candidate, and sussing out this possible connection is in the rather largish stack of things to look into in depth.

So little time, so many papers.

That is exactly why I favor L2/3, that, and the map-2-map connections originating in L2/3.

I agree in principle - lateral connections are present all over the cortex. There are also inhibitory inter-neurons all over the cortical layers. My first take would be to establish a bias point where the cells operate efficiently.

A little further down the road is considering that there are possible large scale voting structures being formed on all layers but only L2/3 communicates between maps. I am unable to wrap my head around what kind of computing structures could emerge from this configuration. It seems like they would be constantly fighting each other and I am unable to establish how training might progress. Then again, considering the huge difference in axonal projection scale in different layers - it could offer some sort of connection between scales of representation. If someone could explain how such a thing might work I would be open to listening.

I really should have read about that a bit before I wrote it. I’m not sure it’s true. Here’s the only study I know of. It’s about inputs to L6 CT cells.

Photostimulation studies wouldn’t show distal inputs very well.

I assume lateral inputs are a large source of input in most layers because most layers form long lateral axons in themselves.

I can’t say for certain whether they share proximal input, but these sources are related to that, i.e. they are not just anatomical.

L5st and L5tt
https://science.sciencemag.org/content/358/6363/610.full

L6, and cited info about L4

If you haven’t, it might be worth making sure they don’t have apical trunks. On the other side of the sheet, in lower L6, apical dendrites often turn to the side and pretty much any direction. In L6 they’re called things like inverted, modified, horizontal, and tangential.

Are you referring to the idea that L5 fires predictively? I think there are a lot of different types of prediction, like temporal memory and possible objects (predictions of things on an object you haven’t fully sensed.) Maybe there’s some form of prediction that fits hex grids well. Maybe that’s temporal memory.

L5 slender tufted cells project to L2/3 a lot. L5 thick tufted cells do not project there much at all, and they project to the thalamus.
You could read about the core-matrix theory of thalamus if you haven’t. The projections from thalamus to L2/3 and other layers are different from the ones to L4 and others. For example, they don’t respect hierarchy, I believe. Dunno if that’s relevant.

In barrel cortex, L5tt → thalamus → L2/3 and L5st. L5 slender tufted cells are similar to L2/3 in that and other ways. I think thalamus → L5st might not exist elsewhere though, and barrel cortex is two levels of hierarchy interdigitated as far as I can tell, so maybe that’s not a loop with thalamus.

Thanks for those papers, I’ll have a look at them.
In the meantime, I wanted to address this. I’m oblivious to most electrochemical explanations, but I’ve formed the impression that most of those signalling specificities (NMDA spiking on distal parts, apical trunk “channel” etc.) are first and foremost structural. For example, L4 spiny stellates are thought to be developmentally PCs with shrunken apical parts, and now each of their dendrites (even those “at the top”) behaves kinda the same… it is with that understanding that I see L2 as not much more complex than stellates.

Yeah, there’s a lot of jargon that could be simplified into HTM-like concepts. The different parts of the dendrite do play different parts in integrating synaptic inputs though.

This article is relevant. L2 neurons at the top of L2 can have apical dendrites turned to the side, and some don’t appear to have an apical dendrite.

Sometimes L4 cells keep their apical dendrites, especially in cat V1 I recall. I don’t know if they do anything. In other layers, sometimes they don’t bother getting rid of their apical dendrite. They just lose most of their apical tuft.
I hate developmental remnants. L6b might be one, so I have no idea whether it does anything useful or is just there to waste time.

Yes still, of course. NMDA towards the tip and all. But no complex Ca2+ gating or whatever it is that makes apical trunks a pain to model. That at least was my take.

Yup not all L4 are stellates.

About L6 “degenerate” apicals: in a recent message to bitking, I hypothesized that this would ideally place them to apically sense a single L5 minicolumn “bursting” in the HTM sense, ready to send such a novelty/surprise signal to the thalamus (for attention-related mechanisms to kick in, hopefully).

Yeah, I think there’s none of that.

In mice, often the degenerate apicals of L6 CT cells don’t even shrink past L2/3. Maybe some still have the right positioning.

Lower L6 has well developed apical tufts in L5a. L5a forms broad axon arbors in L2/3, and they receive input from part of the thalamus which generally requires pretty large whisker deflections to fire so maybe that’s related to surprise too. Except in the other direction, from thalamus to cortex.

I totally get your asking for clarification on which cells of the zoo of cells I am drawing from for my proposal - to be truthful I am also sifting through a good sized stack of papers trying to work this out for myself.

The shape of my search is a core of function that I have derived mainly from a top-down perspective and a general framework implementation of both the global workspace theory and Calvin tile signaling. Starting from this point I have been fitting that to the hardware a bit at a time. It may turn out to be hogwash, but I am encouraged that when I chase some lines down, the hardware I need does exist, although several times I have had to revise how it works as the details differ from my assumptions.

Let me ask this back: have you been able to fit HTM theory to the hardware? I mean a tight fit where you can describe exactly which excitatory cells and inhibitory interneurons do what?

Another thing for L2/3 is the fact that they alone seem to operate at the gamma rhythm.
Seems to me that sustained resonance would occur when communicating rather “quickly”, so that the state from the prior “tick” is recent enough to be integrated by the next.
(no support for this thought… simply brainstorming here)

Excitatory cells maybe a little, and interneurons not whatsoever.

I have a pretty good idea about the L2/3 interneurons; chandeliers on the axon hillock, and projecting stubs hitting the basket cells as they pass through on the way to the reciprocal connections with other L2/3 cells.

Anything you can add about the Hawkins predictive cell and how that works with the Calvin grid forming cells would be useful.

Wall of ramble incoming. I don’t have any conclusive thoughts but I’ll share some trains of thoughts.

I imagine hex grids would complement other things well. For example, voting. I think somewhere there would be voting like it happens in HTM, and hex grids add a consistency to the subnetworks - pushing them all far towards hex grids which overlap less than spaghetti - which helps distinguish the subnetworks/SDRs representing each possible object and in doing so adds more voting.

This doesn’t have much to do with predictive firing but it’s related to prediction.
Maybe hex grids are entirely distal, i.e. they don’t determine minicolumn states or single-fire states. Instead, maybe they determine predictive states. Let’s say that cells which were predicted by distal apical input fire bursts or otherwise fire more in some other suitable way. Martinotti cells have strongly facilitating synapses, so they sort of detect bursts although they may facilitate over longer periods of time. When they receive burst input from just one or two cells, they fire and inhibit most of the distal apical dendrites in the cortical column. That reduces bursting. Overall, it’s like another layer of SDR built on top of the single-fire states. Bursts compete through lateral competition, so maybe they are capable of hex grids.

I’ll think about predictive firing and hex grids more in responses to some other things.

You listed other functions for hex grids, like temporal pooling. Maybe those functions are split somewhat into different layers, rather than hex grids doing voting in every layer.

What sorts of holes does that solve?

If the prediction info passes through the thalamus, then it needs whatever the thalamus does, so things like CT feedback from L6, tonic/burst modes, and inhibition from the thalamic reticular nucleus (which has tonic/burst mode with especially long bursts that can activate GABAB receptors in thalamic relay cells, inhibiting them for a while and thereby putting them into burst mode).

Another thing about the thalamus is almost all info reaches the cortex via the thalamus, and the main cortical output (L5 thick tufted cells) projects to the thalamus in parallel with projecting to other subcortical structures. Those L5 TT cells don’t project much within the cortex, except to each other, so maybe their signals are subcortex-like and need to pass through the thalamus before the cortex can use them. Maybe there’s something like a language barrier between the subcortex and cortex which the thalamus resolves. You could replace language with whatever you want. Just something that prevents the cortex (minus L5tt) from communicating directly with the subcortex/L5tt.

The barrier is probably something unique to the cortex which subcortical structures lack, not something the subcortex has which the cortex lacks, since subcortical structures are varied. I’ve gotten the impression or very vaguely remember that parts of the thalamus project to other subcortical structures and those thalamic nuclei are non-specific, projecting broadly and related to oscillations or brain states. Maybe that difference from thalamic nuclei which project to the cortex could help figure things out. For example, maybe it’s about attention filtering because broad projections aren’t suited for focusing on a specific part of the sensor.

It might have something to do with oscillations since the thalamus is involved in generating oscillations (the TRN generates spindle oscillations on its own). Oscillations might have something to do with grid cells. Hex grids might have something to do with grid cells too. The prediction signal from L5tt might have something to do with behavior and/or displacements. So maybe the prediction signal passes through the thalamus as part of grid fields. Grid fields are in space, not the cortical sheet, so they wouldn’t form in L2/3. There’d be gridiness with respect to the cortical sheet and information related to location, but not yet converted into grid fields. Grid fields might form in L5st cells since L2/3 projects to them a lot. Based on axon arbors, L2/3 appears to mainly target L5st cells rather than L5tt cells.

L5 corticothalamic (thick tufted, L5tt) cells are also the motor output, so perhaps they can’t always produce predictions for the purposes of perception or else they would constantly send motor commands. Instead, maybe L5 slender tufted (L5st) cells are the predictive ones.

L5st cells receive input from higher order thalamus (to which L5 corticothalamic cells project). They also bias L2/3 in a way possibly related to behavior and location*. So maybe they receive something from the thalamus which conceptually combines behavior and whatever form of abstraction the hierarchy performs. Behavior and abstraction are both closely related to prediction (e.g. sequences of behaviors, predicting results of behaviors, and predicting the rest of a set of features to recognize an object).

*They might depolarize cells in L2/3 such that each L2/3 cell is depolarized most greatly at a specific phase of the whisking cycle, which you can think of like an angle**. I might be remembering wrong. That’s related to location because it would make cells in L2/3 more responsive to sensory stimuli at a particular part of the whisking cycle. You’ve said grid cells respond at grids of locations because of hex grids. For that to happen, it’d have to be somewhat indirect because grid cells respond to grids on the landscape, not on the cortical sheet. Maybe it could help to think of hex grids as a way to separate subnetworks. Location is continuous but grid fields are not, so maybe it’s some sort of quantization. In the EC, there’s a gradient of grid field spatial intervals. Maybe there’s a gradient of hex grid intervals, and all of them continuously integrate a movement signal. Normally, that wouldn’t do much since they quantize it. They’d say location 2 + displacement of .1 = location of 2.1, which rounded is still 2. But with a gradient of hex grid intervals, there’s always one hex grid changing to another state, so they could interact with other hex grids (i.e. parts of EC for other hex grid intervals) to change their quantized state when they should.

**It’s not an angle because the whiskers oscillate around a variable midpoint with variable amplitude, but that’s beside the point.
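
To make the quantization point concrete, here is a minimal sketch of the "location 2 + displacement of .1" example. It is only an illustration under stated assumptions: the function names and the list of intervals are made up, each hex grid is reduced to a one-dimensional interval, and positions are stored as integers in tenths of a unit so the arithmetic is exact.

```python
# Hypothetical illustration, not HTM or EC code. Positions and intervals are
# integers in tenths of a unit, so location 2.0 is written as 20.

def quantize(location, interval):
    """Index of the field of width `interval` that contains `location`."""
    return location // interval

def modules_that_change(location, displacement, intervals):
    """Intervals whose quantized state differs before and after the move."""
    moved = location + displacement
    return [s for s in intervals if quantize(location, s) != quantize(moved, s)]

# Assumed gradient of hex grid intervals (3 means 0.3 units, 10 means 1.0).
INTERVALS = [3, 5, 7, 10]

# Location 2.0 plus a displacement of 0.1, both expressed in tenths.
changed = modules_that_change(20, 1, INTERVALS)
print(changed)
```

In this run the interval-10 module still reads 2 after the move, while the interval-3 and interval-7 modules cross a boundary and could in principle tell the coarser ones when to update. With these particular intervals not every small step crosses a boundary in some module, so how often "there’s always one changing" holds depends on the gradient chosen.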

By far and away - the biggest one for me is transitions. As far back as 1890, in The Principles of Psychology, Volume 1 (of 2) by William James, the transition from one state to another was under consideration.
[Image: Evolution of thought]
Calvin tiles do a good job of representing a single state - a frozen moment in time. How do they go from this moment to the next one? If we are able to hold a lifetime of experience in the high-dimensional manifolds of map-2-map joining of tiles, what guides us from this state to the next? There has to be a way of representing transitions in representations.

As I reflected on this I remembered reading On Intelligence and the whole predictive thing that I saw as the central thrust of the book - this fit the bill perfectly.

If you have the time, these books are an excellent read; William James got a lot of things right.

@Casey The next one is the detection of surprise. The constant comparison of internal state with external state and signalling when something is deviating from our predictions does not fit very well with tiles. They would just keep snapping to whatever learned state is the closest to the sensed states. The predictive/bursting method is the perfect signal back to the thalamus to gate signal to the subcortical structures that something odd is happening. Note that the “everything is cool” signal (single cells sparsely distributed) is also the lowest energy state, which is absolutely what we want in terms of efficiency; the cortical fabric is only activated when something unexpected is happening and requires attention. This also then gates more signal to the cortex, making it more active and burning more energy when it is needed.
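
For anyone who wants the bursting idea spelled out, here is a rough sketch of how predicted versus bursting minicolumns could yield a surprise signal. It is a simplification of the HTM temporal memory activation rule, not Numenta's implementation: the function names, the `(column, cell_index)` cell representation, and `cells_per_column=4` are assumptions made for illustration.

```python
# Simplified sketch of HTM-style activation: a predicted column fires only its
# predicted cells, an unpredicted column bursts (all of its cells fire).

def activate(active_columns, predicted_cells, cells_per_column=4):
    """Return (active_cells, bursting_columns) for one time step."""
    active_cells = set()
    bursting_columns = set()
    for col in active_columns:
        column_cells = {(col, i) for i in range(cells_per_column)}
        predicted_here = column_cells & predicted_cells
        if predicted_here:
            # "Everything is cool": sparse, single-cell activity.
            active_cells |= predicted_here
        else:
            # Unexpected input: the whole minicolumn bursts.
            active_cells |= column_cells
            bursting_columns.add(col)
    return active_cells, bursting_columns

def surprise(active_columns, bursting_columns):
    """Fraction of active columns that were not predicted."""
    if not active_columns:
        return 0.0
    return len(bursting_columns) / len(active_columns)

columns = {1, 2, 3}
predicted = {(1, 0), (2, 3)}  # column 3 was not predicted
cells, bursting = activate(columns, predicted)
score = surprise(columns, bursting)
```

Note how the predicted columns contribute one cell each while the surprised column contributes all four, which matches the point about the predicted state being the low-energy one. Whether a score like this is what actually drives the thalamus is, of course, the open question.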

That attention may just be learning.

It is hard to imagine a better driver for the “tonic mode” vs “burst mode” in the thalamus than surprise bursting in the predictive mechanism. As you indicated, this signal is then passed on to the lizard brain for consideration. The lizard does not care about the fancy coding of hex-grids, it wants to know that a bug is crawling on us right now.

Conversely - a patch of tonic mode signals that whatever is happening on the corresponding hex-grid patch should be a stable representation - sort of a “good data” signal.

Maybe both these things are happening in different layers. I like the idea of hex grids being formed in one layer and the minicolumns spanning all layers. This is like another information channel between layers, where one layer can transmit a state to other layers via minicolumn activations.

I think it could be happening in another layer.

Even better, consider the minicolumns as a substrate over which many communications are happening at once… one layer might do SP to activate minicolumns, while another is activated via hex-grids? Just spit-balling.

So close …
one layer might do TM to activate minicolumns, while another is activated via hex-grid to do SP!
