Relative Coordinate Encoder


Woke up thinking about the CoordinateEncoder and what it would mean if the squares were relative to a particular point rather than objectively laid out. As an example, imagine an owner who put a smart tag on her pet so she could track him on Google Maps. That’s how the current CoordinateEncoder (the Geospatial one, specifically) might work. But what if the tag sent its location relative to the owner’s phone? It might still need to be an effectively infinite plane, but a specific square in the encoder wouldn’t equate to a fixed grid on the map. I can imagine a lot of scenarios where tracking an object relative to a reference point would be useful.

I do not think there are necessarily practical implications of using the existing CoordinateEncoder for this purpose; it’s just using relative X,Y values instead of absolute ones. Instead, I am curious what the community thinks about this approach and its implications. Does it violate any assumptions of the CoordinateEncoder, or of HTM generally? One implication is that the object may “jump” in the plane whenever the observer changes the reference point, which doesn’t happen in the objective use case.
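To make that concrete, here is a minimal sketch in plain Python (not the NuPIC CoordinateEncoder API) of the only change this would require: subtracting the reference point before the coordinate reaches the encoder. The function name and the readings below are hypothetical.

```python
def to_relative(position, reference):
    """Express an absolute (x, y) reading as an offset from a reference point."""
    return (position[0] - reference[0], position[1] - reference[1])

# Hypothetical readings: the pet tag reports absolute map squares,
# and the owner's phone is the moving reference point.
pet = (105, 212)
phone = (100, 210)

print(to_relative(pet, phone))  # -> (5, 2)
```

The encoder itself is untouched; only the meaning of a square changes. It also shows the “jump” implication: if the reference moves from (100, 210) to (0, 0), the same pet position suddenly encodes as (105, 212).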

In some scenarios, it might also be helpful or even necessary to track the absolute position of the reference point. Would this be better captured in a different Cortical Column (i.e., as a different sensory input instead of part of the relative-position sensor)?


It reminds me of displacement cells, which can indicate a relative location from a point.


Thanks, @rhyolight. Your suggestion had me go back and read the Thousand Brains papers again (a worthwhile endeavour). My understanding (and please correct me if I am wrong) is that grid cells, and therefore displacement cells, represent “a location relative to the object being sensed, not relative to the person sensing the object”. Grid cells provide where on the object the sensor’s data is coming from, and displacement cells provide the relative location of features (or of the same feature after movement). All of this is in the context of object recognition and/or composition.

This seems different to me than the traffic-flow example in the Geospatial Coordinate Encoder video. Given a fixed, albeit infinite, coordinate system like the city map, and sequences of drives to and from work, one can predict the likely pattern and, therefore, anomalies. If it took me 5 extra minutes to get from the McDonald’s to the bank, then there is unusual traffic in that part of the map. I am not trying to build an object model of the car or the road segment. If I used grid/displacement cells, am I modeling the concept of “unusual traffic”? Is the “object” I am trying to recognise a specific traffic pattern? Does the fact that the feedforward data are tied to objective locations (not relative to the observer) impact the predictions?
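For what it’s worth, the kind of prediction I mean in the driving example can be sketched without HTM at all, e.g. as a simple statistical check on one segment’s travel-time history (the numbers below are made up):

```python
from statistics import mean, stdev

# Hypothetical travel times (minutes) for one map segment
# (McDonald's -> bank) recorded over prior morning drives.
history = [7.0, 8.0, 7.5, 8.2, 7.8]
today = 13.0

# Flag today's drive if it falls far outside the historical distribution.
z_score = (today - mean(history)) / stdev(history)
print(z_score > 3.0)  # -> True: unusual traffic on that segment
```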

Perhaps a different example would help illustrate my question. Instead of a pet relative to the pet owner, what about a boat at sea? In a simple construct, given the wind direction and speed and the boat and target locations, one can establish sequences from prior voyages and begin to see patterns of getting from point A to point B. This is all seen from outside the space, on a single grid system. A is always at X1,Y1; B, always at X2,Y2. Given a wind coming from X3,Y3 at K knots, we can record the path of the boat at intervals. Based on previous crossings, if the boat is not making adequate progress from A to B, then we might identify an anomaly (maybe the sail is damaged?).

What if the boat is going from point C to point B? Point C is at X4,Y4, the same distance from B as A is. Let’s say the wind angle and speed, relative to the boat, are the same as on the previous voyage when it left from A. Given the distance and wind direction/speed, we’d expect the same travel pattern from C to B as from A to B. But the coordinate encoder would produce a very different encoding for a boat at A than for one at C, as the starting location and wind direction would be wholly different values. To the HTM network, A->B would be a different sequence than C->B, correct? If so, wouldn’t this mean that the HTM network would not pick up on the similarity of the patterns?

But … if I am standing at B, with the wind at my back, and I see a boat directly in front of me, then I can estimate its arrival time as well as whether something is wrong (coming in too slowly or too quickly). I do not care whether I am currently looking in the direction of A or of C. My “sensor” data doesn’t have (or need) the absolute position of the boat on the grid, just the boat’s and the wind’s positions relative to me. A relative encoding would put the boat at an X,Y that is an offset from my position at B. The wind direction and speed would be relative to B as well.
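The observer-at-B intuition can be sketched as a change of frame: translate so the observer is at the origin, then rotate so “directly in front of me” is the +x axis. Under that transform (all names and coordinates here are hypothetical, not part of any encoder API), the A->B and C->B snapshots become the same input:

```python
import math

def in_gaze_frame(point, observer, heading):
    """Express `point` in a frame where `observer` is the origin and
    `heading` (radians, world frame) points along +x ('straight ahead')."""
    dx, dy = point[0] - observer[0], point[1] - observer[1]
    cos_h, sin_h = math.cos(heading), math.sin(heading)
    # Inverse rotation: world coordinates -> observer coordinates.
    return (dx * cos_h + dy * sin_h, -dx * sin_h + dy * cos_h)

B = (0.0, 0.0)
boat_from_A = (-5.0, 0.0)  # halfway along A -> B, approaching from the west
boat_from_C = (0.0, -5.0)  # halfway along C -> B, approaching from the south

# The observer at B turns to face each boat in turn:
west = math.pi
south = -math.pi / 2
print(in_gaze_frame(boat_from_A, B, west))   # ~ (5.0, 0.0)
print(in_gaze_frame(boat_from_C, B, south))  # ~ (5.0, 0.0)
```

Both snapshots reduce to “a boat 5 units straight ahead”, which is why a relative encoding would let the network treat the two voyages as the same pattern.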

NOTE: I am not trying to solve this specific problem, just thinking generically. If there is a flaw in the example, feel free to ignore it.


You are right, they are completely different. We did not know about grid cells when we created that encoder. It was not based on biology.


So are the Coordinate encoders deprecated in favour of Grid cells?


I would not say deprecated, but you have more options. There has been lots of work on this recently, check these out!


The coordinate encoder is biologically plausible. It represents place cells in the hippocampus.


Maybe, but that is not what we were thinking about when we created it. In HTM algorithms, we make a strong effort to understand the biology of the brain before attempting to recreate the algorithms. In the case of the CoordinateEncoder, this was not done. It was just an idea Jeff had that was put into place, not based on any neural biology. It is interesting and serendipitous that a theoretical similarity exists here. 🙂
