How is HTM supposed to deal with spatial invariance?

Spin,

The entorhinal cortex is cortex. One of the takeaway lessons I have gotten from books like “On Intelligence” is that cortex is cortex. The distributed grid representation is clearly spread over a very large chunk of the entorhinal cortex. Why should this computing pattern not be found elsewhere in the cortex?

The older allocortex in the hippocampus communicates with the neocortex and distills this representation into chunks like “places.” That strongly suggests that the place information is held in the cortex, distributed over this grid pattern and available to be sampled. Hence the claim of a distributed representation in the grid cells. SDRs are well suited to taking a small sample from several intermingled distributed representations across many sensory modalities and combining them into a meaning.
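To make the sampling idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the 2048-bit width, the 40 active bits, the 15-bit sample size, and the helper names `random_sdr`, `subsample`, and `overlap` are hypothetical and not taken from any HTM library. It only shows that a union of small samples from several SDRs still overlaps strongly with each source while barely touching an unrelated pattern.

```python
import random

N_BITS = 2048   # assumed SDR width (illustrative)
ACTIVE = 40     # assumed number of active bits, roughly 2% sparsity
SAMPLE = 15     # assumed number of bits sampled from each modality


def random_sdr(rng, n_bits=N_BITS, active=ACTIVE):
    """Return a random SDR as a frozenset of active bit indices."""
    return frozenset(rng.sample(range(n_bits), active))


def subsample(sdr, k, rng):
    """Take a 'little taste' of an SDR: k of its active bits."""
    return frozenset(rng.sample(sorted(sdr), k))


def overlap(a, b):
    """Number of active bits shared by two SDRs."""
    return len(a & b)


rng = random.Random(42)

# Three stand-in modalities, each with its own distributed pattern.
vision, touch, sound = (random_sdr(rng) for _ in range(3))

# Combine a small sample of each into one blended representation.
combined = frozenset().union(
    *(subsample(s, SAMPLE, rng) for s in (vision, touch, sound))
)

# A pattern that took no part in the blend.
unrelated = random_sdr(rng)

for name, sdr in [("vision", vision), ("touch", touch),
                  ("sound", sound), ("unrelated", unrelated)]:
    print(name, overlap(combined, sdr))
```

Each contributing pattern shares at least 15 bits with the blend, while a random pattern of this sparsity is expected to share about one bit or fewer, which is why a small sample is enough to tell the sources apart.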

I have many reasons to think that there are other codings like “place” in the hippocampus, but that they are not as easy to stumble into by monitoring free-roaming critters. It may be a real stretch to suggest that other parts of the brain use a grid pattern, but consider this: we were able to sample grid cells because there was a strong relationship to something happening in the cage to correlate with. (See reference [7] above.) I don’t think I have seen anyone looking for this sort of property in other parts of the cortex.

I find distributed representation in grid patterns to be a delightful solution to the coding problem. Even if this turns out to be non-biological for most of the cortex, it codes very well in neural networks. The RatSLAM people are doing very interesting things with this. In the H of HTM, we need to take a point in the input pattern space and spread it out so that it can interact with other parts of the pattern: a distributed representation. Humans have trouble visualizing this because we see in 2.5D (2D with stereo cues and color). Once we move past the highly topographic V1 cortex, it gets hard to conceptualize how the data gets blended together. The V1 cells look like a topographic pattern because the retina feeds them a topographic input. Once we get into the progressive visual maps, it gets less clear what is happening to the representation. I can see how a sort of FFT-like transform could be spreading and combining different frequencies of sampled data across a distributed pattern in these maps. By the time we get to the association areas, we have multiple streams of senses blended together. What is the lingua franca? Grids would work as well as anything here.
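As a toy illustration of why grids code so well, here is a minimal Python sketch. It is a deliberate simplification under stated assumptions, not RatSLAM's method and not a biological model: positions are integers, each module is a square lattice rather than a hexagonal one, and the periods in `SPACINGS` as well as the helper names `encode` and `shared_modules` are hypothetical. Each module only knows the position modulo its own period, so any single module is ambiguous, yet the combination across modules is unique over a much larger area.

```python
from itertools import product

# Hypothetical module periods; pairwise coprime so the combined code
# repeats only every 3 * 4 * 5 = 60 steps along each axis.
SPACINGS = (3, 4, 5)


def encode(x, y, spacings=SPACINGS):
    """Return the active cell (phase) of each grid module for (x, y)."""
    return tuple((x % p, y % p) for p in spacings)


def shared_modules(code_a, code_b):
    """Count the modules in which two positions activate the same cell."""
    return sum(a == b for a, b in zip(code_a, code_b))


here = encode(1, 2)
alias = encode(4, 2)   # 3 steps away: the period-3 module cannot tell

print(here)                         # active cell in each module
print(shared_modules(here, alias))  # only the period-3 module agrees

# Every position in a 60 x 60 arena gets its own combined code,
# using just 3*3 + 4*4 + 5*5 = 50 cells in total.
codes = {encode(x, y) for x, y in product(range(60), repeat=2)}
print(len(codes))
```

The point of the sketch is the economy: a handful of small periodic maps, each ambiguous on its own, jointly pin down a location, and any downstream reader can sample whichever modules it happens to reach.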

I agree that this is highly speculative and that it may be premature to assert that this is the ground truth, but it does fit very well with many years of reading about this sort of thing. It just feels right. Win or lose, I will keep researching to learn whether this is how the brain does it. I strongly suspect that the answer will be this or something very close.
