A Framework for Intelligence and Cortical Function Based on Grid Cells in the Neocortex

To clarify, what do you mean exactly by allocentric and egocentric spaces? Are your definitions of these terms consistent with the definitions here: http://www.nmr.mgh.harvard.edu/mkozhevnlab/?page_id=308?


Yes, the same. I would rather say that an object is a representation of an object's features only in relation to itself in space.


The way I see it, your neocortex stores information in displacement cell modules only for those features you have noticed and/or investigated. That's why a football player knows a football much better than someone who has only ever watched the game on screen.

The more you have studied the object, the more information is stored across many columns (sensory info + displacement info), and the more cross-voting happens when you come in contact with the object again.

When the football player touches a football, (s)he doesn’t need to detect every single feature of the football. Only a few features associated with the correct displacement info are enough.
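The voting idea above can be sketched in a few lines of entirely illustrative Python. The object memory, features, and displacement values are made up, and this is not Numenta's implementation; it only shows how intersecting candidate sets across columns lets a few features identify an object:

```python
# Hypothetical learned memory: object -> set of (feature, displacement) pairs.
object_memory = {
    "football": {("seam", (0, 1)), ("valve", (3, 2)), ("panel", (1, 1))},
    "basketball": {("seam", (0, 1)), ("pebble", (2, 2))},
}

def candidates(sensed_pair):
    """Objects consistent with one column's sensed feature + displacement."""
    return {obj for obj, pairs in object_memory.items() if sensed_pair in pairs}

def vote(sensed_pairs):
    """Intersect candidate sets across columns; a few features suffice."""
    result = None
    for pair in sensed_pairs:
        c = candidates(pair)
        result = c if result is None else result & c
    return result

# Two features are already enough to narrow it down to one object:
print(vote([("seam", (0, 1)), ("valve", (3, 2))]))  # {'football'}
```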


When you see or touch the hinge of your (well-known) stapler and also see or touch the front, your brain also knows how far those features are supposed to be apart. That is allocentric.

When you see the floor of the room you're standing in, you know how to walk without losing balance, because your brain knows how far your eye level is supposed to be from the floor. That is egocentric.

Knowing objects and knowing yourself are therefore probably based on similar neurological systems in your neocortex.


I think there have been experiments with sensory deprivation, for example in terms of orientation lines, etc., which have shown that in impoverished environments what is represented in cortex can be more limited. I may be misrecalling, but I think there were also experiments where the addition of random noise extended the length of development windows even in impoverished environments.

In a way there are not only objects but also noisy, incomplete, and garbled patterns that should not be stored. I think a frequentist-like selection mechanism might help separate objects from non-objects during storage. I also think that the long-range connections will create a positive feedback loop favoring patterns that are part of a larger whole, that is, positive selection between small-scale patterns that are part of a larger distributed pattern across the cortex (say, sound patterns that tend to co-occur with a particular visual pattern and are part of a larger distributed visuo-auditory pattern).

The brain has limited information storage capacity despite its vast size, and it responds to stimuli within very short time periods, allowing only a limited number of processing steps. With only a limited number of action potentials sent or received between any two areas, only a limited amount of information can get through. The stored and transmitted information is finite in nature, but it appears the representational capacity is high enough to handle real-world phenomena.

I have a question, and sorry if this has been answered before. Are the displacement cells (L5TT cells) supposed to be position-invariant? Do the displacement cells care about the starting and ending locations, or do they only care about the distance and direction between the starting and ending locations?

For example, if I took one step forward and then another step forward, the grid cells would change their activity to represent each location as I stepped through it. Would the displacement cells remain the same because I'm making the same displacement? Or are the displacement cells specific to the starting and ending locations?


Displacement cells only care about relative displacement. The starting cell is not important.
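One way to picture this start-invariance (a toy sketch, not the paper's mechanism): within a single grid cell module, a location can be treated as a phase modulo the module's period, and a displacement as a phase difference, which comes out the same for any starting phase. The period and phase values below are purely illustrative:

```python
PERIOD = 40  # hypothetical grid module period, in arbitrary units

def displacement(phase1, phase2):
    """Phase difference within one module, invariant to the starting phase."""
    return (phase2 - phase1) % PERIOD

# The same step (+7 units) taken from two different starting locations
# yields the same displacement code:
d1 = displacement(3, 10)
d2 = displacement(31, 38)
assert d1 == d2 == 7
```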

P.S. Something to bear in mind… In the discussion section of the frameworks paper we pointed out that the description of displacement cells is incomplete. As described in the paper, they would correctly encode movements, but they are insufficient to encode all aspects of object composition. For example, they can't represent the logo on the cup if the logo was at an angle. They also can't encode logos at different spatial scales. We are currently working on extensions to the theory that fill in these missing pieces. I believe the basic idea of displacement cells is still correct, but we have to add in other details related to orientation and scale.


Thank you. I read the paper again and it makes more sense now.

In the discussion section of the frameworks paper we pointed out that the description of displacement cells is incomplete.

From page 6:
Grid cells: (Location1 + Displacement => Location2)
Displacement cells: (Location2 – Location1 => Displacement)

The subtraction seems implausible. I think instead the grid cells have a mini-column structure like a temporal memory, so that the grid cells represent a sequence of locations. The displacement cells could then be modeled as a column pooler. Could the same mechanisms that form viewpoint invariance in L4 and L2/3 be reused to represent displacements that are invariant to their starting point?
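For reference, the two quoted operations can at least be sketched as modular arithmetic over a few hypothetical grid cell modules. The module periods are made up, and this only illustrates the equations themselves, not a claim about how neurons would implement the subtraction:

```python
PERIODS = [7, 11, 13]  # illustrative module periods, not from the paper

def add_displacement(location, disp):
    """Grid cells: Location1 + Displacement => Location2 (per module)."""
    return tuple((l + d) % p for l, d, p in zip(location, disp, PERIODS))

def subtract_locations(loc2, loc1):
    """Displacement cells: Location2 - Location1 => Displacement (per module)."""
    return tuple((b - a) % p for a, b, p in zip(loc1, loc2, PERIODS))

loc1 = (2, 5, 9)
disp = (3, 3, 3)
loc2 = add_displacement(loc1, disp)
assert subtract_locations(loc2, loc1) == disp  # the operations invert each other
```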

I was thinking that this new framework for intelligence could be used to perform machine translation (and similar tasks). If you think about it, words, sentences, etc., are objects. Even if they have different encodings depending on the language, they represent the same object. For example, “water” in English has a different “encoding” (sequence of letters) than “Wasser” in German, but they represent exactly the same object. This framework could be used to map between such objects (which can be words, sentences, or even full essays): we might initially encode the sentence (or word, etc.) in one language and then the sentence in the other language, and then check whether these objects overlap in a certain way. We actually need an algorithm that performs a slightly different operation: given only the encoding of an object in one language, we want to find the encoding of the same object in another language.
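A minimal sketch of that retrieval step, under the assumption that words in each language are encoded as SDRs (here just toy sets of active bit indices) paired by their shared underlying object. Given a partial or noisy encoding in one language, we return the paired encoding with the best overlap; the bit values are invented for illustration, not real encodings:

```python
# Hypothetical learned pairs: (English-side SDR, German-side SDR).
pairs = [
    ({1, 4, 9, 12}, {2, 7, 11}),   # "water" <-> "Wasser"
    ({3, 5, 8, 20}, {6, 13, 19}),  # "house" <-> "Haus"
]

def translate(sdr_en):
    """Return the German-side SDR whose English side overlaps sdr_en most."""
    return max(pairs, key=lambda p: len(p[0] & sdr_en))[1]

# A noisy, partial encoding of "water" still retrieves "Wasser":
noisy_water = {1, 4, 9, 15}
print(translate(noisy_water))  # {2, 7, 11}
```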

I would like to note that this is related to what Cortical.IO is doing, but not exactly the same thing. AFAIK, they are not involved in machine translation. I am specifically suggesting to use this new framework to perform machine translation. Furthermore, note that I stated that objects do not need to be words, sentences, etc. In general, if we want to perform some kind of mapping between objects, I think that this framework could be used to achieve such a task. The framework may need to be extended. I still need to think about it a little more.

I do not want to be arrogant, but if this turns out to be a useful idea, I would like to at least be cited, provided that no one has already suggested this before me.

I would suggest that if you think this is a useful idea, you should work through it to provide a working example.

This idea is of a large enough scope that it would involve a considerable investment of hours to make it happen; this kind of idea requires a “true believer” to put in those hours.

I would never expect someone else to put in that kind of effort to elaborate a vague proposal like the one you are offering. It is also vague enough that it would be hard to pair any object encode/decode model with your proposal.

I think that the next step, in order to be compliant with the biology, would be to investigate how humans actually perform mappings of objects (like sentences), from a neuroscience perspective.

There is already a vast library of work on this topic from a neuroscience perspective. You may find it useful to spend some time researching it on Google.


Honestly, ideas are a dime a dozen. What you have typed here isn’t even close to being formalized enough that someone could actually take what you have written and implement something in software. It is essentially like someone typing, “You know, I bet HTM could be used to predict stock market crashes,” and someone else later doing the enormous mountain of work that would actually be required to predict stock market crashes with something based on HTM. IMO, that other person would have zero moral obligation to cite the first person, who put virtually no effort into the actual product.


I am specifically suggesting using this framework to solve a specific problem. Yes, it is an idea or suggestion. If I had a complete work, I would have shared the complete work rather than this suggestion. I knew I would sound arrogant (in fact, I wrote that I didn’t want to be arrogant; I just wanted to be credited if this idea turns out to be useful).

Anyway, this is not the main point of my post above. If someone is interested in this suggestion, it might be useful to investigate its plausibility and, eventually, develop more concrete ideas.

I think that most of the AI researchers here on the forum are working on these ideas. Most of us have our own view of how words are parsed, stored, combined and produced.

I really did not see anything novel in what you posted, but I gave it the benefit of the doubt that you have some insight you had not elaborated in your post. The reason I suggested that you work through this is that you may be able to detail whatever insight you are harboring and show it in a working example.

Cortical IO does not really do much that corresponds to HTM: the learning is all offline, the SOM clustering is not done with HTM, and the SDR operations are done outside of cells with external processing. Really, the use of SDR math is the only real overlap, and even that is not done in a way that is compatible with HTM processing. I would add that it is not done in a biologically plausible way.

I do respect that the word store in the Cortical IO retina is seductive but there is a vast gap between that and what is being considered with the framework proposal.

Now, if you can see a way to combine them AND reduce it to a practical demonstration, even a simple one, you would have a lot of people’s attention.

Saying “you could make a machine to fly through the air” is a very long way from making an airplane and should NOT get you credit for inventing the airplane.


Understood. I was just pointing out that you’ll need to put more work into the idea before you will likely be credited. The basic idea of sentences as objects with semantic meaning is not a new one (I even have a thread on it somewhere here on the forum, and have used it to build a product for my employer Ericsson). I recommend taking the time to formalize your idea and document more of the details on how your idea would actually work. Then folks are more likely to take notice and credit you for the effort you put into the idea.


Is there a pseudocode write-up that corresponds to the paper “A Framework for Intelligence and Cortical Function Based on Grid Cells in the Neocortex”?


Hi @codeallthethingz thanks for joining the forum! If you look in the supplemental materials you’ll find an additional PDF with more details, but not quite pseudocode.

I’ve been wondering: assuming that a column is a computational unit, how much of it is hardwired and how much of it is learned? Is the ability to calculate locations, displacement cells, etc. (in a human neocortex) innate and built from DNA, or could it be that it is actually learned?

Location space and displacements are learned, but the mechanisms are genetic. Scientists are still trying to figure out exactly how things like grid cells and place cells learn space, but we know space is learned. You don’t start out as a baby knowing how to move your body through space; you have to move around semi-randomly for months before movements start making sense.