Tracking position with dead reckoning

In a couple of my earlier threads (Learning an ordered sequence out of order and Learning a collection of features where order doesn’t matter), I mentioned the concept of using a representation of “current position” to learn positional information about the features of an object. In those threads, I said that this information would be provided by some external system or process. Here is one possible method for providing that positional information, using the concept of “dead reckoning”. This idea may have no basis in reality, but I feel like there could be value in exploring the concept.

For those who are not familiar with dead reckoning, the idea is that one vector can be represented as the sum of a series of other vectors. This means that, in the absence of any landmarks, current position can be determined by keeping track of every movement made after some known starting position. For example, if I take 1 step forward and 2 steps back, that is equivalent to taking 1 step back. Rather than performing math (1 - 2) or having an internal GPS sensor, I can instead track all the intermediate steps and arrive at the same result. Obviously this was a simple example – the math gets much more complicated when you consider rotational information and multi-jointed mechanics. By continuously tracking each movement made, I can maintain an accurate current position.
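As a minimal sketch of the idea (the function name is invented, just for illustration), dead reckoning amounts to summing every movement vector made since the starting position:

```python
# Hypothetical sketch: dead reckoning by accumulating movements.
# Position is never measured directly; it is the running sum of
# every step taken since a known starting position.

def dead_reckon(start, movements):
    """Return the current (x, y) position after applying each movement."""
    x, y = start
    for dx, dy in movements:
        x += dx
        y += dy
    return (x, y)

# "1 step forward and 2 steps back" is equivalent to 1 step back:
steps = [(0, 1), (0, -1), (0, -1)]
print(dead_reckon((0, 0), steps))  # -> (0, -1)
```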

One problem with this system is that errors build up over time. This means, of course, that I either need very accurate (high-granularity) inputs, or I need to “reset” the system periodically, or both. For example, if you are walking around a forest, you may only have a general idea of where you are, but you are able to “reset” your position by encountering familiar trees or rock formations. I have not yet formulated a good method for doing these “resets”.

That said, this basic concept has a lot of potential uses. It could be applied to features on an object being touched, for example. Perhaps a simpler way to visualize it, though, is to imagine the object as a room that you are moving through, encountering features along the way. I will explain the concept initially in terms of 2 dimensions because it is easier to visualize that way, but keep in mind that the concept can easily be extended to 3 dimensions.

For 2 dimensions, the process involves three regions of cells. Two of them are long strips of cells to depict heading and velocity information. The third will be a large sheet of cells for depicting current position and for predicting the next position based on input from the other two regions:

This system relies on the concept of a “starting position”. How and when the starting position is determined are questions that I still need to work out. This will likely be related to the “reset” process I mentioned earlier (getting a strong indication of accurate current position, such as seeing a familiar tree in my forest example, might reset the starting position back to the center of the sheet). For now, let’s assume the starting position is the center of the large sheet, and doesn’t get reset. Of course this means that there is a limited distance I can travel in any given direction, so I’ll need to tackle that problem later.

You will notice that I have reserved multiple cells for each value (4 cells per value in the heading + velocity regions, and 9 for current position). These numbers are completely arbitrary and could be adjusted. Fewer cells would reduce overall memory usage but be more susceptible to noise and errors. In practice, the heading + velocity regions would probably also be much longer, for tracking a finer granularity of velocities. The large sheet for position would also have a lot more cells in it, so that larger distances and a finer granularity of position could be tracked.
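To make the redundancy concrete, here is a hypothetical sketch of a strip that dedicates a fixed block of cells to each value (the 4-cells-per-value figure matches the arbitrary choice above; `encode` is an invented name, not HTM terminology):

```python
# Hypothetical sketch: encoding one value as a contiguous block of
# cells in a strip, so each value is represented redundantly.
# CELLS_PER_VALUE = 4 matches the arbitrary choice in the text.

CELLS_PER_VALUE = 4

def encode(value_index, num_values):
    """Return the set of active cell indices for a value in the strip."""
    assert 0 <= value_index < num_values
    start = value_index * CELLS_PER_VALUE
    return set(range(start, start + CELLS_PER_VALUE))

# A strip covering 5 velocity values uses 20 cells; value 2 activates
# cells 8 through 11:
print(sorted(encode(2, 5)))  # -> [8, 9, 10, 11]
```

With more cells per value, a few noisy or dead cells still leave enough of the block active to recognize the value.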

The heading + velocity regions are used to represent a velocity vector. For example, moving slowly to the right:

Moving faster to the right:

Moving backward and to the left:

One thing you will notice here is that we are only handling translations – we are not depicting rotation anywhere. Rotation is actually important, though (for example, I sense direction with my head, which I may turn to the right or left while walking). I have some ideas for handling rotations, but none of them are yet as simple as the way I’m handling translations. I’ll post an update when I figure out some better options for that.

These inputs will cause the cells in the current position sheet to predict the next position. Moving slowly for example:

And moving a bit faster to the left:

As you can see, if I update the current position cells at a constant rate and keep track of my current heading + velocity, the current position will be an up-to-date representation of where I am in relation to the “starting position”.
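A toy sketch of that update loop, assuming the position is just a pair of grid coordinates relative to the starting position at the center of the sheet:

```python
# Hypothetical sketch: updating a grid position at a constant rate
# from the current heading + velocity. The sheet is modeled as a 2-D
# coordinate grid; the starting position is its center, (0, 0).

def step(position, velocity):
    """Shift the active position by the current velocity vector."""
    return (position[0] + velocity[0], position[1] + velocity[1])

position = (0, 0)      # starting position: center of the sheet
velocity = (-2, 0)     # moving a bit faster to the left
for _ in range(3):     # three update cycles at a constant rate
    position = step(position, velocity)
print(position)  # -> (-6, 0)
```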

This is good for a scenario where there is only one moving part (the whole “1 step forward, 2 steps back” scenario). But what about a system where there are multiple moving parts? For example what if I lean forward while taking a step back? My “current position” (assuming it represents the position of my head) may not have changed in this case. Well, I can use the dead reckoning concept to add up all the smaller movements and arrive at the starting position. That could be done by cycling the values in the heading + velocity strips between the various inputs:

And back to where I started…

Another way is to have additional heading + velocity strips which cascade up through some hierarchical structures which ultimately drive global heading + velocity strips. Since something like this is probably also needed for handling transformations, that is the route I am looking at. I will post an update when I have some ideas for how to do this.
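As a rough sketch of the cascading idea, assuming each body part reports its own movement vector, the global heading + velocity could simply be the sum of its children’s contributions (names invented for illustration):

```python
# Hypothetical sketch: per-part heading + velocity values cascading up
# to a single global heading + velocity. Each moving part contributes
# its own movement vector; the parent level just sums its children.

def global_velocity(part_velocities):
    """Combine each part's velocity vector into one global vector."""
    gx, gy = 0, 0
    for vx, vy in part_velocities:
        gx += vx
        gy += vy
    return (gx, gy)

# Leaning forward while stepping back cancels out at the global level:
print(global_velocity([(0, 1), (0, -1)]))  # -> (0, 0)
```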

The final piece of this concept is that (as you have probably noticed) the next position should be predicted based on the current position and the heading + velocity inputs. This allows each cycle to shift cells to the correct current position. These predictions are done by growing connections between cells in the large sheet and cells in the two strips, and placing cells into a predictive state, as in other areas of HTM. Considering the symmetry of the system, the connections could probably be calculated mathematically, but an easier way would be a training phase that teaches the cells in the sheet which inputs should result in their becoming active next. There may be some way to have this information learned through experience, but I don’t expect it is too far-fetched to assume that systems like this can be pre-programmed (for example, wildebeest infants are able to run within hours of birth – that must require a lot of pre-programming).
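Here is a heavily simplified sketch of such a training phase, with a plain dictionary standing in for the learned connections (`train` and `predict` are invented names, not HTM terminology):

```python
# Hypothetical sketch: a training phase that teaches sheet cells which
# (current position, velocity) input should make them active next.
# A dictionary stands in for the learned distal connections.

transitions = {}

def train(position, velocity):
    """Record which position follows a given (position, velocity) pair."""
    next_pos = (position[0] + velocity[0], position[1] + velocity[1])
    transitions[(position, velocity)] = next_pos

def predict(position, velocity):
    """Look up the learned prediction for the next position."""
    return transitions.get((position, velocity))

# Train on every position/velocity pair in a small sheet, then predict:
for x in range(-2, 3):
    for y in range(-2, 3):
        for v in [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]:
            train((x, y), v)

print(predict((0, 0), (1, 0)))  # -> (1, 0)
```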

Ultimately, the large sheet of cells depicting the current position can be used as an input to a spatial pooler for use in learning positional information about the features of an object. See the two threads I mentioned earlier for some possible use cases for positional information. I am working on some apps to explore practical uses for some of these concepts (the first app will be a “robot” that can learn to navigate a 2D maze). I’ll post links to those for reference and discussion when they are complete.


I’ve given some thought to the problem of how to handle rotations as simply as the method for handling translations I described above. So far, I’ve come up with an idea which works well for simple joints that rotate along a single axis (such as an elbow, or the last two joints of a finger). I haven’t yet come up with something simple for joints that rotate on multiple axes (such as a wrist or shoulder joint), but I thought I would share what I have so far.

The first thing to recognize is that rotations, like translations, can be added up using the same dead-reckoning strategy to end up at the correct position. In the case of rotations, though, you are dealing with arcs rather than straight lines.
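A minimal sketch, assuming each single-axis joint reports incremental rotations in degrees – the dead-reckoned angle is just the running sum (function name invented):

```python
# Hypothetical sketch: rotations can be dead-reckoned the same way as
# translations, by accumulating each incremental rotation of a
# single-axis joint. Angles are in degrees.

def dead_reckon_angle(start_angle, rotations):
    """Return the joint's current angle after applying each rotation."""
    angle = start_angle
    for delta in rotations:
        angle += delta
    return angle

# Rotating 30 degrees one way and then 10 degrees back:
print(dead_reckon_angle(0, [30, -10]))  # -> 20
```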

For this discussion, I will use a simple system of two joints. As I mentioned above, they need to be simple joints that rotate along a single axis. This is something like the last two joints of your finger.

If we look at the first joint (the one closest to the hand, if we are imagining this as a finger), the full range of motion of this joint can be used to trace out an arc. The longer the arm, the larger the arc.

Now we can visualize a row of cells lined up along this arc. Note that for the system to work, there isn’t any requirement for the cells to be positioned like this – they could be randomly mixed into some large SDR as long as the semantic relationships between them are retained. Lining them up this way just makes it easier to visualize and explain. Also note that in practice, you would probably want to use more than one cell per position on the arc, for less susceptibility to noise and errors.

If we also look at the full range of motion of the second joint, we can visualize a second arc of cells (not my best drawing… hopefully you get the idea):

If you are imagining this as a finger, just pretend the finger belongs to someone who is double-jointed :slight_smile: I used this particular range of motion to better explain my next point without having to use a system with three joints. Note that the system works with any number of joints, but for explanation purposes, two joints are simpler to talk about.

Now what we want to do is use these two arcs of cells in a way that we can output a semantically identical output representation for orientations in which the sensor (tip of the finger for example) is in the same physical position:

One way to accomplish this is to overlay the two arcs, and use that to trace out a strip of cells representing all possible positions the sensor could be located at:

Any orientation, represented by the two arcs, can be represented as a position on the strip. For example, the following orientation:

Could be represented like so:

Similarly, we can get the identical representation we want for the other orientation with the same position:
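The claim that two different joint orientations can put the sensor in the same physical position can be checked with ordinary two-link forward kinematics. This is just a numerical sanity check (unit-length segments, angles in degrees, all names invented), not part of the cell-based mechanism itself:

```python
import math

def tip_position(a1_deg, a2_deg, l1=1.0, l2=1.0):
    """Forward kinematics of a two-joint arm, each joint single-axis.

    a1_deg is the first joint's angle; a2_deg is the second joint's
    angle relative to the first segment.
    """
    a1 = math.radians(a1_deg)
    a2 = math.radians(a1_deg + a2_deg)
    x = l1 * math.cos(a1) + l2 * math.cos(a2)
    y = l1 * math.sin(a1) + l2 * math.sin(a2)
    return (round(x, 6), round(y, 6))

# Two different joint orientations put the sensor in the same place:
print(tip_position(0, 90))    # -> (1.0, 1.0)
print(tip_position(90, -90))  # -> (1.0, 1.0)
```

Both orientations should therefore map to the same position on the strip, which is exactly the semantically identical representation we are after.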

So how do we get from the two arcs to the strip correctly representing the position? One way would be a method similar to my previous post, where cells in the strip have distal connections with cells in the two arcs, and simple sequence memory predicts a position based on the two inputs. That might be easiest for the two-joint system, but could become difficult with more joints (the more joints you add, the more likely false positives become).

Another way would be to cycle through the arcs one at a time from inner to outer, in a multi-step sequence memory type of process, something like:

More steps involved, but we end up at the same result (and it scales better with more joints).

It occurred to me, while looking at the above drawings, that if I were to view the strip of position cells as a series of smaller arcs whose cells overlap some of the other arcs, and then straighten those arcs out to form columns, I would have a system of columns containing cells that have proximal connections with each other for each position. Using the above two-joint system as an example, the two orientations I listed could be represented like so:

This is effectively an SDR representing the position of the sensor, and encoding the two contexts for how the sensor can arrive there. This output could in turn be used as an input to another similarly constructed layer for capturing orientation information about a third joint, and so on. The result would be an SDR with bits for all possible orientations that add up to the same sensor position. An intriguing idea I think I will explore further…
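A toy sketch of that encoding, assuming each position is a column and each context (each way of arriving at that position) is a cell within the column – all names and the 4-cells-per-column figure are invented for illustration:

```python
# Hypothetical sketch: a position SDR where each position is a column
# and each cell within the column encodes one context (one way the
# joints can add up to that position). Assumes 4 cells per column.

CELLS_PER_COLUMN = 4

def encode_position(position_index, context_index):
    """Return the active cell for a position reached in a given context."""
    assert 0 <= context_index < CELLS_PER_COLUMN
    return position_index * CELLS_PER_COLUMN + context_index

# The same sensor position in two different orientations shares a
# column (cells 8 through 11) but activates different cells in it:
print(encode_position(2, 0))  # -> 8
print(encode_position(2, 1))  # -> 9
```

Two activations in the same column overlap in position while still distinguishing context, which is the property the paragraph above describes.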