Performing basic math with grid and displacement cells

This is just AI hype. I read the paper. It’s just Deep Networks doing what we’ve always known they do. Saying they have “learned to tell the difference between big and small” is an overstatement.


From neurobotany, a very simple example of counting:

Unfortunately, the entire mechanism seems to be mostly unknown. But at least some counting ability is possible without needing a complex brain.

Right you are, at the end of the piece the doubts about the validity of the statement are articulated by the sceptical associate professor Peter Gordon. There is no notion of bigger/smaller, previous/next or any embodied knowledge for that matter, just a symbol that represents the scene. Nevertheless I do find some inspiration in these kinds of romantic AI articles.

The question in the OP remains open: what does it take to make an HTM “understand” numbers and do operations with them?


I think we need to think about mathematics more like language, in the same way music is referred to as a language. Languages are abstract conceptual objects, each one with a unique frame of reference that intersects our sensory representations in ways unique to our individual experiences with reality.


Remember that mathematics is a cultural creation

I can’t wrap my head around this. Bees can count using 4 brain cells. It seems that math is at the root of all HTMs. Even HTMs without a neocortex are able to count. Algebra is a lot more complex and abstract than simple arithmetic. Perhaps algebra can be seen as a cultural creation, but not in the literal sense. Culture accelerated its growth.

Ok, but I don’t think that last part is correct. Our mechanism does not account for it, at least. I don’t think the brain is doing it. Empty space is simply not represented. Zero is an abstract mathematical concept.

But if you are saying that information exists within the representation about empty space, simply that the lack of sensory feature existence at a location implies empty space, then yes I agree with you there. But it is implicitly represented, not explicitly represented.

This means that you need to “contemplate” empty space in order to understand it, unless the brain can sense it and represent it explicitly.

Bees can even choose the value of zero, when trained to select the lesser of two quantities.

I can’t believe that bees naturally grew an understanding of 0 and empty space by contemplating implicit fundamental questions while developing their cognitive maps. It didn’t come at birth but it certainly isn’t a hard concept to use.

What I’m proposing is to associate empty space explicitly to objects that have been recognized. For example, let’s say we’ve trained an HTM doing visual inference and showed it the Numenta cup floating inside an empty room. If the cup wasn’t there, nothing would be stored, because there would be no sensory input to activate a representation that would later be stored. When the cup is visible, this HTM should be able to recognize every object as both an object AND an “empty room”.

Cup AND “empty room”
Cylinder AND “empty room”
Handle AND “empty room”
NumentaLogo AND “empty room”
Cup AND “empty room”

You can only store the “empty room” associated with something else, not by itself. The higher concept of 0 or empty room is represented like any other concept. The odd thing is that it’s associated with every recognizable object, but not as extra information. Every recognizable object is a unique room filled with sub-objects. The fact that it’s empty while it’s filled is the extra information.

If an “empty room” isn’t the commonality then something else must be. Even the fact that all objects are place cells in L2/3 should be considered a commonality.

When there are 2 recognizable objects in a room that share a commonality:

Object1 AND Commonality
Object2 AND Commonality

Allows for asking this question:

What are 2 simultaneously active “Commonality” objects associated with? How will it move?

The associated object (by looking at grid cells) is the abstract room of the numerical object “2” and can be moved like this: addition, subtraction, multiplication (by looking at displacement cells).

Change this to “nothing”, because an empty room is something. It took me a while to respond, sorry!


What I’m proposing is to associate empty space explicitly to objects that have been recognized.

Not a unique “empty room” representation for every object but a single one that refers globally and is connected to all of them. Any activation of an object would activate the representation of the “empty room”, hence it’s always active.

When you were learning the number 7 you would construct it as a room with these 7 numbers placed inside it:

0 = {} AND “empty room”
1 = {0} AND “empty room”
2 = {0, 1} AND “empty room”
3 = {0, 1, 2} AND “empty room”
4 = {0, 1, 2, 3} AND “empty room”
5 = {0, 1, 2, 3, 4} AND “empty room”
6 = {0, 1, 2, 3, 4, 5} AND “empty room”
7 = {0, 1, 2, 3, 4, 5, 6} AND "empty room"
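The list above resembles the von Neumann encoding of natural numbers, where each number is the set of all smaller numbers. Below is a minimal Python sketch of that encoding only, assuming a `frozenset` stands in for a "room" and leaving out the “empty room” association. The function names `ordinal` and `add` are hypothetical, and nothing here claims to be how an HTM would do it.

```python
def ordinal(n):
    """Build n as the set of all smaller numbers: 0 = {}, 1 = {0}, 2 = {0, 1}, ..."""
    current = frozenset()  # 0 is the empty set
    for _ in range(n):
        # successor(k) = k united with {k}
        current = current | {current}
    return current


def add(a, b):
    """Add two set-encoded numbers by applying the successor step len(b) times to a."""
    result = a
    for _ in range(len(b)):
        result = result | {result}
    return result


print(len(ordinal(7)))                           # 7
print(ordinal(6) in ordinal(7))                  # True
print(add(ordinal(5), ordinal(2)) == ordinal(7)) # True
```

In this encoding the size of the set is the number itself, and addition is just repeated application of the same "place one more item in the room" step.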

Any recognition of an object would recognize an “empty room”, so this would apply to numerical objects also. I would assume that since there’s the rule of 100 cells for each task there can’t be more than a handful of objects recognized in each cycle.

*Note: You would learn the transformations performed by operations of addition and subtraction simultaneously while learning what numbers are. You would model how the global/general “empty room” representation behaves instead of relying on unique incidents of everyday objects moving in and out of spaces.

Now, for an example: When 7 cups are shown to you they are recognized throughout a sensory sum of patches:

Cup1 AND "empty room"
Cup2 AND "empty room"
Cup3 AND "empty room"
Cup4 AND "empty room"
Cup5 AND "empty room"
Cup6 AND "empty room"
Cup7 AND "empty room"

This way, it would be stored as 7 cups AND their associations to the “empty room”. This is similar to 7 numbers and their associations to the “empty room”. Since both cups and numbers are associated to the “empty room”, you would be able to associate the group of objects you are currently observing to the number 7, as previously performed when you were learning what the number 7 is.

The “empty room” is the default representation of a room before it has any features associated to it:

Empty Room = True Nothing @ Empty Room Locations

I suspect that a better way to think of this is to focus on the concept of object representation as a collection of features. One common collection of features is the default set when no particular object is present to bind to - nothing.

Language cannot communicate the true essence of “nothing” without assigning something to it. So, we are stuck on whether “empty room” is something or “nothing” is truly nothing.

  1. Nothing has structure because it’s a word but true nothing has no structure.
  2. There’s always space masking the appearance of true nothing but you can also argue that true nothing has no appearance hence it’s always apparent. An empty cup is filled with air, the vacuum of space is filled with sub-particles and even true empty space is filled with spacetime. True nothingness can’t be observed through sensory input when we are bound to live inside a space-time continuum.

Black Dot = Commonality + Black Dot Features @ Black Dot Locations
Democracy = Commonality + Democracy Features @ Democracy Locations

We all know this intuitively. This is all based on true nothing not the abstract nothing we infer is “void”, “emptiness”, “air”, “spacetime”, “invisible aether”.

If “empty room” cells were active in every object representation, they would provide no semantic meaning to anything. Inputting “empty room” with no context would predict everything.

Since every cell can’t physically activate at once in a real brain, you would just end up with a random set of cells at the maximum allowed density.

One could simulate exactly the same behavior by not having an “empty room” representation at all, and instead activating a random sampling of cells when there is no input.
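The objection above can be illustrated with a small sketch. It assumes SDRs are plain sets of active cell indices and that recognition is a simple overlap test. The cell indices, object names and the `matches` helper are all made up for illustration.

```python
# Hypothetical SDRs: sets of active cell indices; cell 0 plays the "empty room" role.
EMPTY_ROOM = 0
objects = {
    "cup": {EMPTY_ROOM, 3, 7, 12},
    "handle": {EMPTY_ROOM, 5, 9, 14},
    "logo": {EMPTY_ROOM, 2, 8, 11},
}
# The same objects with the shared cell removed.
without_shared = {name: sdr - {EMPTY_ROOM} for name, sdr in objects.items()}


def matches(cue, sdrs, threshold=1):
    """Return the objects whose SDR overlaps the cue in at least `threshold` cells."""
    return sorted(name for name, sdr in sdrs.items() if len(sdr & cue) >= threshold)


print(matches({EMPTY_ROOM}, objects))   # ['cup', 'handle', 'logo']
print(matches({3, 7}, objects))         # ['cup']
print(matches({3, 7}, without_shared))  # ['cup']
```

A cue made only of the shared cell matches every object, while cues built from real features give the same answer whether or not the shared cell exists.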


An input of “empty room” doesn’t make sense. It’s not a feature you can recognize and if it’s not a feature then it can’t be an input. When there’s no input the cells that represent “no input” can’t be activated. I suppose the “empty room” is a byproduct of path integration. Once an organism moves (becomes intelligent) it has to deal with the concept of empty space.

A definition of an object is a set of locations that are connected together by movement (via path integration). Some of these locations have associated features, and that defines how you know which room you are in, but not all of them do. You don’t have to have features at every location; you just have to have some.

If I understood Jeff’s words correctly there can be locations with no features associated to them:

"empty location" = A location with no associated features.

"empty object" = A set of locations with no associated features.

A single location is also a set of locations. Therefore, an “empty location” is an “empty object” or “empty room” and should have its own representation in higher regions of the hierarchy.

Every visual object has lines. This doesn’t mean that the cell that represents a line is connected to all objects in the brain. So, when you look at a plain line it predicts everything.

It could be that L6a is doing associated counting which is related to sensory input directly and L6b is doing algebra which is related to sensory input but in the reference frame of higher concepts.

I think a line is itself a feature/object. In other words, if I see just a line, then the object that is predicted is a line (not every object that contains a line). If I sense (do not sense?) truly nothing, then I do not predict anything. These seem different to me.

My current thought process keeps coming back to the fact that you really don’t learn things by just passive exposure. Objects/concepts are built up by moving in some space (physical or conceptual). Therefore, I don’t think you really need to apply a representation for “nothing” to everything. When I am moving my finger through empty space to explore an object, those motor commands (ability to move through some areas but not others) are fed back and become part of the object model. This is a much more rich/complex type of “empty”, which can also be modality-specific. For example, you can see through glass, but not reach through it, or you can feel an object behind an opaque panel but not see it, or you can hear a cricket chirping “right there” but be unable to find it.

BTW, I find this to be a very interesting topic, so hopefully my comments are helpful in solidifying your concept, and not just discouraging.


I think it depends on your interpretation. It could also mean that the mechanics to observe space lead you to a location where there happens to be no sensorial input. Your neocortex doesn’t store that data, but there are grid cells that “code” for that location, as there are for a huge potential space.

If later at that same location a new feature is introduced (for instance you fill the empty coffee cup with sand), then your brain creates new SDR sets for that location that previously had none associated to it.

It changes from a non-coded location (empty) to a new stored feature (grid cell / displacement cell + sensor data).


I think it depends on your interpretation. It could also mean that the mechanics to observe space lead you to a location where there happens to be no sensorial input. Your neocortex doesn’t store that data, but there are grid cells that “code” for that location, as there are for a huge potential space.

Exactly this. I don’t know why it came out that we might be referring to different things.

I don’t think you really need to apply a representation for “nothing” to everything.

We don’t apply anything extra. The “nothing” is already part of the model; it has to be a part of the model for the mechanism to be complete, and it is identical to the location without a feature that you need for path integration to work. If you can imagine a location without a feature being an object then this is it:

Object = SetOfFeatures@SetOfLocations

OneLocation = SetOfLocations

EmptySet = SetOfFeatures

EmptyObject = EmptySet@OneLocation

EmptyObject = OneLocation
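One way to read the notation above is as a mapping from locations to feature sets. The sketch below assumes that reading. The type aliases, the `make_object` and `is_empty_object` helpers, and the coordinates are hypothetical and only meant to make the definitions concrete.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

Location = Tuple[int, int, int]
ObjectModel = Dict[Location, FrozenSet[str]]  # Object = SetOfFeatures@SetOfLocations


def make_object(features_at: Dict[Location, Iterable[str]]) -> ObjectModel:
    """Freeze the feature set stored at each location."""
    return {loc: frozenset(feats) for loc, feats in features_at.items()}


def is_empty_object(obj: ObjectModel) -> bool:
    """True if the object has at least one location and no features anywhere."""
    return len(obj) > 0 and all(len(feats) == 0 for feats in obj.values())


cup = make_object({(0, 0, 0): {"rim"}, (0, 0, 1): {"handle"}, (0, 1, 1): set()})
empty_object = make_object({(5, 5, 5): set()})  # EmptyObject = EmptySet@OneLocation

print(is_empty_object(cup))           # False
print(is_empty_object(empty_object))  # True
print(list(empty_object))             # [(5, 5, 5)]
```

With no features attached, the only information the empty object carries is its location, which is what the last line of the definitions states.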

The difficult part is the way this could be done without flooding the system with “empty objects”, because this is what our intuition says would happen. Since the world is practically infinite and almost empty, we should be constantly overwhelmed by the practically infinite number of locations without features. In reality, though, we might be storing only the “empty objects” we’ve sensed and passed through with our senses, not the space we later infer to be too large and apply our intuition to.

An example:

When you observe an empty cup your eyes inevitably saccade to locations where there are no features: the empty container inside, the empty container of the handle ear or the empty space above the handle. You may observe 50 locations, each with an associated feature from the same or a different part of the cup, but you would also observe some locations where there are no features while performing the observation. When your fingers move to sense different parts of the cup they pass through a lot of empty space in order to reach those features. You practically pass through the whole “atmosphere” of the cup when you buy and drink a coffee. There are a lot of motions involved: picking it up repeatedly, throwing it away, moving it while you move, etc.

The capacity of grid cells with a lot of modules is practically infinite, too large for us to need to be cautious about storing too many of the empty locations we’ve passed through, each naturally associated with a sensor patch. “Empty” means no feature, so this would imply that it would be impossible to associate it to the cup, but the location is physically the same location the cup is located at.
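To make the capacity claim concrete, here is a toy one-dimensional sketch. It assumes each grid cell module reports only the position modulo an integer period, which is a strong simplification of real 2D grid cells. The periods and function names are made up. In this toy model the capacity is large but finite: it equals the least common multiple of the periods.

```python
from functools import reduce
from math import gcd


def location_code(position, periods):
    """Phase of `position` within each module: the combined grid code."""
    return tuple(position % p for p in periods)


def capacity(periods):
    """Number of positions before the combined code repeats (least common multiple)."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods, 1)


periods = (3, 5, 7, 11)  # assumed module periods, pairwise coprime
codes = {location_code(x, periods) for x in range(capacity(periods))}

# 26 distinct phases across four modules uniquely code 1155 positions.
print(sum(periods), capacity(periods), len(codes))  # 26 1155 1155
```

Adding a module multiplies the number of distinguishable locations while only adding a few more phases, which is why the representable space grows so much faster than the number of cells.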

Also, the number of neurons is fixed. If naturally 5 out of 50 of the locations we’ve passed through are “empty objects”, why not store them associated to the cup?

Where Pathway

You know how to reach things and where they are relative to you. You can’t recognize what things are.

What Pathway

You can’t reach for things. You can recognize they are out there.

The “empty object” can’t be involved with the What Pathway directly. This makes perfect sense because an “empty object” means a sensor patch received no input, and that is never true. You don’t see the emptiness above the handle; you see the object that takes up the visual space behind and above the cup (a computer, a wall or anything else).


The whole idea is that locations with no associated features are omnipresent in representations about objects in the where pathway. This commonality can be used for associated counting mechanisms. Interestingly, the where pathway is where all the necessary neural structures (grid cells and displacement cells) are available. This should naturally emerge from the necessities of path integration if I understood it correctly.


Yes, those locations are represented in the brain’s representation of space. But they are not explicitly represented. They are represented in the same way a cup contains empty space within it. The fact that we have moved our sensors through that space and perceived nothing is relevant, but unnecessary to store. It is simply a part of the object (or the lack of a part). The emptiness is not explicitly coded. It is inherent in the structure of the model.

I don’t think this mechanism is even possible unless the “emptiness” concept is explicitly coded, which I have argued all along in this thread it is not.


The fact that we have moved our sensors through that space and perceived nothing is relevant, but unnecessary to store. It is simply a part of the object (or the lack of a part).

You don’t observe “the nothing”. I’m not saying the lack of features is directly related to causing us to recognize the “nothing”. We later developed this understanding, but in the beginning we could have been able to use it without knowing what it is. A bee doesn’t need to know the flower is devoid of nectar; the lack of it is enough. It doesn’t know about “nothing” in that higher sense nor does it know about any other hidden mechanism like rooms and grid cells.

I don’t think this mechanism is even possible unless the “emptiness” concept is explicitly coded

We wouldn’t be able to code “emptiness” explicitly in the lower regions that deal with sensory input, even if we needed to, because it would be impossible to sense. Sensors need to do nothing in order to sense nothing, but obviously when they are bound to do nothing while sensing they can’t store what they’ve sensed, or function to survive for that matter.

I think you still don’t understand.

If a grid cell module produces blips every 5 inches, then blip blip blip blip blip -> you are now 25 inches away. If there is nothing there to observe, your brain doesn’t store any information. The module still blips when you cross that point in space, but no changes apply. Not to the grid cell modules (they keep blipping the way they always did - that’s what I mean with a location that is coded for) and not to the SDR that receives the grid cell input but no sensory input for that location.

-> no change

If at the fifth blip now something new is detected (a physical feature), then in layer 5 of some neocortical columns new synapses are producing connections. A modified SDR takes shape, combining the grid cell input and the new sensory input.

-> change

Next time you cross that point, the new SDR combined with the grid cell modules will predict something, because the brain stored that information in the SDR where previously there was nothing. Not in physical space, and nothing in the brain for that location.
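The blip example above can be written as a small sketch. It assumes a single one-dimensional module with a 5-inch period and a memory that only gains an entry when a feature is sensed. The class name, method names and the "edge" feature are hypothetical; this is not the actual cortical mechanism, only the storage behaviour described.

```python
PERIOD_INCHES = 5  # spacing between "blips", taken from the example above


class SparseLocationMemory:
    """Stores a feature only for grid locations where something was sensed."""

    def __init__(self):
        self.features = {}  # blip index -> feature; empty locations get no entry

    def traverse(self, distance_inches, sensed):
        """Walk `distance_inches` along a line and return the number of stored changes.

        `sensed` maps blip index -> feature for the locations that have one.
        """
        changes = 0
        for blip in range(1, distance_inches // PERIOD_INCHES + 1):
            feature = sensed.get(blip)  # None means nothing to observe here
            if feature is not None and self.features.get(blip) != feature:
                self.features[blip] = feature
                changes += 1
        return changes

    def predict(self, blip):
        """Return the stored feature for a location, or None if nothing is stored."""
        return self.features.get(blip)


memory = SparseLocationMemory()
print(memory.traverse(25, {}))           # 0: five blips, nothing sensed, no change
print(memory.traverse(25, {5: "edge"}))  # 1: a feature appears at the fifth blip
print(memory.predict(5))                 # edge
print(memory.predict(3))                 # None
```

The module "blips" at every location either way; only the location with a feature leaves a trace that can later produce a prediction.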

Edit: I forgot to read your last post, which makes me wonder what we are arguing about. But then I read this:

Locations without sensor data have no encoding of their own. Only grid cells fire when you cross them.

Edit 2: I think I understand where you might be confused. The representation of an object is not a huge 3D model of all possible positions, leaving placeholders for the “empty” space. It’s a bit like a linked list of features each with their relative coordinates. If the object has few features, the list is short. Even if the object is big and the features are far away from each other. All the spaces in between don’t get added to the list. It’s like some intrinsic compression. If you detect new features, a new data set (coordinates + sensory input) is added to the list.

Does this make more sense?

OK then, please outline what happens when I look at a dice and see FIVE dots.
Where does the five part register?
The same questions for anything where we see some shape and associate it with a quantity.
Is it grid cell math?

Is this the same place that takes two dice with FIVE dots and TWO dots and comes up with SEVEN total?


Here’s my speculative take on it:

When I look at a standard size red dice with the face with five white dots most prominently oriented in my direction, my senses detect a few hundred data points.

Most of those points get processed and normalised through the layers and regions of the neocortex and end up in a distributed cluster of bits that together form a number of SDRs.

One of these SDRs for instance codes for a cubic shape. Edges and surfaces are in the correct location for most cubes my neocortex knows, but not all. The edges are rounded, so those bits in the SDR for cubes that code for straight edges don’t fire, since this dice doesn’t have those. Nonetheless, looking at the dice will make my neocortex associate it with a cube.

One of these SDRs codes for a red object. This SDR is shared with other red objects, like cherries and strawberries and that Ferrari GTO model on my drawer. But a red dice fits into that SDR, so there is a binding link. It’s not inconsistent with my model of a typical dice, that this one happens to be mainly red.

One of these SDRs codes for white spots, an SDR that is shared with other objects that have white spots, like some mushrooms or dust particles on a black surface or the unfortunate holes in my favorite pants.

One of these SDRs associates the pattern of spots with a five. It is shared with other clusters like five spots in a row, five spots in a circle, five squares, five stars, the Arabic numeral 5, the Roman numeral V, and the written word ‘five’. Each one of those other fives in the SDR has numerous bits that don’t fire for this cluster of five white spots on our dice. But enough do coincide, so that the neocortex recognises the five.

All these SDRs are like circles in a huge Venn-diagram with massive fault tolerance. One of the intersections on the Venn diagram is our red white-dotted dice with the five on top.
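The Venn-diagram picture can be sketched as overlap matching with a threshold. The example assumes SDRs are sets of active cell indices, each concept has 40 active cells, and a concept counts as recognised when at least 25 of them are active. These numbers, the `sdr` helper and the cell ranges are all invented for illustration.

```python
def sdr(start, count=40):
    """Hypothetical helper: a block of `count` consecutive active cell indices."""
    return frozenset(range(start, start + count))


stored = {
    "cube": sdr(0),
    "red": sdr(100),
    "white spots": sdr(200),
    "five": sdr(300),
    "sphere": sdr(400),
}


def recognise(observed, threshold=25):
    """Return every stored concept that shares at least `threshold` cells with the input."""
    return sorted(name for name, cells in stored.items()
                  if len(observed & cells) >= threshold)


# The observed dice activates most, but not all, cells of each matching concept
# (for example, rounded edges leave the "straight edge" cells of the cube SDR silent).
observed_dice = (
    frozenset(range(0, 30))       # 30 of 40 cube cells
    | sdr(100)                    # all 40 red cells
    | frozenset(range(200, 235))  # 35 of 40 white-spot cells
    | frozenset(range(300, 328))  # 28 of 40 five cells
    | frozenset(range(400, 405))  # only 5 sphere cells: below threshold
)

print(recognise(observed_dice))  # ['cube', 'five', 'red', 'white spots']
```

The set of concepts that pass the threshold together is the intersection described above: a red, white-dotted cube showing a five.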


I’m sorry I have abandoned this the past few weeks. I was busy with exams and moving. I think it’s pointless to live in the periphery of our imagination without an actual experiment that will establish what can work.

OK then, please outline what happens when I look at a dice and see FIVE dots.

I don’t have a precise answer to this question yet. I will build the calculator app with NodeJS or Objective-C using only layers of columns, grid cells, displacement cells and whatever new comes up within the next 4-5 weeks and I will post it here including a detailed analysis of how it works.