# Learning an ordered sequence out of order

This is the second of my tangential theories based on HTM, exploring capabilities that I believe could help in simulating parts of the sensorimotor system. As mentioned in my last thread, the purpose is to collaborate in exploring the idea in the hope of distilling useful information.

This second theory involves utilizing elements of HTM in a way to allow learning an ordered sequence out of order. One way to think about this is learning a piano piece that has several difficult sections. Initially you may skip over the difficult parts just to get a feel for the piece, and then come back later and train extensively on those areas. The point is that although you learned different sections of the piece out of order, the piece as a whole still has a defined sequence which has been learned.

A key element of this theory is the concept of "current position" in the sequence. Current position could be any of a number of things. For example, it might be the slider bar on a media player, indicating a time stamp in a song or video. It might represent positions on an object that are organized in some ordered sequence. For example, a sequence of 4 elements could represent four sides of a box. You could learn any of the features in any particular order, but the sequence itself (i.e. the position of the features on the object) remains ordered.

A four-sided box is a simple way to visualize the concept, but it could be extended to involve much longer sequences. If you think of the sequence as a string of features, that string could be wrapped around an object in some fashion, allowing a detailed depiction of the object and the positions of features on it.

In this theory, "current position" consists of an SDR representation of minicolumns, just like an input in the sequence. Cells within the columns will depict context (I will explain that in a bit). This theory assumes that some other system or process not covered here will provide us with the current position (i.e. which columns should be active).

In the examples below, we will work with a sequence of 4 elements. If it helps to think of the sequence as features of an object, see the depiction of the box above. We will be working with two distinct features (A and B), arranged in sequential order "A B B A". We will be learning each element of the sequence out of order.

First, we will depict each element of the sequence as two SDRs. One will represent the current position, and the other will represent the feature at that position (there is no particular physical arrangement required). For the purpose of visualization, I'll depict the sequence as a row of four position SDRs over four feature SDRs.

This arrangement or separation is not necessary for the system to work; the columns and cells that represent the positions and the features may be intermixed. It just makes the process easier to visualize.
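The layout described above can be sketched in code. This is a minimal sketch under assumed sizes (256 columns per SDR, 8 active): each position and each feature is simply a small random set of active column indices, kept in separate dictionaries only to mirror the "positions over features" visualization.

```python
import random

random.seed(0)

NUM_COLUMNS = 256  # columns available per SDR (assumed size)
ACTIVE = 8         # active columns per SDR (assumed sparsity)

def random_sdr():
    """Pick a random sparse set of active column indices."""
    return frozenset(random.sample(range(NUM_COLUMNS), ACTIVE))

# Four ordered positions and two distinct features.
position_sdrs = {p: random_sdr() for p in (1, 2, 3, 4)}
feature_sdrs = {f: random_sdr() for f in ("A", "B")}

# The object's ordered sequence of features: "A B B A".
sequence = {1: "A", 2: "B", 3: "B", 4: "A"}
```

Random SDRs like these will occasionally share a few columns; a real encoder would control overlap deliberately, but for this sketch collisions are unlikely and harmless.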

## Iteration #1

The first step is to receive input about the current position. This can be any random position in the sequence. The current position is input as a set of columns becoming active. In this case, we will assume the current position is 3. Since this is the first input, all cells in these columns will burst.

The next step is to connect some cells in this position to cells in whichever position was last active (the normal HTM sequence learning process). Since this is the first input, no new connections are made.

The next step is to take the feature that is input at this position and treat it as if it were the next input in a sequence (the first input being the "current position" from the previous step, and the second input being the feature at this position). In this case, the input will be "B". Since this is the first input, the columns that represent "B" will burst. We connect some of their cells with some of the cells that fired in the "current position" (the normal HTM sequence learning process). Unlike the normal HTM sequence learning process, however, we also connect in the other direction (so cells from position 3 also connect back to cells from input "B"). This is equivalent to learning the sequences "3 B" and "B 3".

Next, we activate the position before the current position. In this case, we activate the columns representing position 2. Since we have not encountered position 2 yet, the columns for position 2 will burst, and since no feature has been connected to that position, no cells will enter the predictive state.

Next, we connect some of the cells that fired in the "current position" to cells that are in the predictive state as a result of activating the position before the current position. In this case, no cells are in predictive state, so no new connections are made.

Those are the steps which will be repeated for each new input. The idea is to learn the context of the positions ("Position 3 with Feature B, in a sequence where Position 1 has Feature A"), and to activate the previous position at each step in order to learn the sequence. After enough iterations of the process, the associations will be fully linked.
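The repeated steps can be traced in a toy, cell-free sketch. The big assumption here is that "connect some cells" can be abstracted down to recording a directed link between whole SDR labels; real HTM operates on individual cells and distal segments, so this only traces which associations get made, and in which direction. The class name and labels are hypothetical.

```python
from collections import defaultdict

class OutOfOrderLearner:
    """Toy trace of the per-input steps; not a real HTM implementation."""

    def __init__(self):
        self.links = defaultdict(set)  # source label -> linked labels
        self.learned_feature = {}      # position -> feature learned there
        self.last_position = None      # position active on previous input

    def present(self, position, feature):
        # Step 1: connect the current position to whichever position was
        # active last (ordinary sequence learning on the position stream).
        if self.last_position is not None:
            self.links[("pos", self.last_position)].add(("pos", position))
        # Step 2: treat the feature as the next input and connect in BOTH
        # directions, equivalent to learning "3 B" and "B 3".
        self.links[("pos", position)].add(("feat", feature))
        self.links[("feat", feature)].add(("pos", position))
        # Step 3: activate the position before this one; if its feature
        # was learned earlier, those feature cells would be predictive.
        prev = position - 1
        predictive = self.learned_feature.get(prev)
        # Step 4: connect the current position to the predictive cells
        # (skipped when there is no previous position or no prediction).
        if predictive is not None:
            self.links[("pos", position)].add(("prev_feat", predictive))
        self.learned_feature[position] = feature
        self.last_position = position

# Learn the sequence "A B B A" out of order: positions 3, 1, 4, then 2.
sequence = {1: "A", 2: "B", 3: "B", 4: "A"}
learner = OutOfOrderLearner()
for p in (3, 1, 4, 2):
    learner.present(p, sequence[p])
```

After this run, every position is linked to its own feature in both directions, positions 4 and 2 are linked to the features of the positions before them, and position 1 (having no predecessor) skipped steps 3 and 4, matching the walkthrough below.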

I'll walk through a couple more iterations just to repeat the process a bit.

## Iteration #2

Let's assume the "current position" now changes to position 1. Since we have not encountered this position yet, its columns will burst. We connect some cells in this position to cells in position 3, which was active last.

Next, we take the feature at this position (in this case "A"). Since nothing was predicted in this context, the columns that represent "A" will burst. We connect bi-directionally between some of the cells that fired for position 1 and some of the cells that fired for input "A".

Since position 1 is the first position, we skip the step for activating the position before the current position.

Since no cells will be in predictive state, we do not make any new connections.

## Iteration #3

Let's assume the "current position" now changes to position 4. Since we have not encountered this position yet, its columns will burst. We connect some cells in this position to cells in position 1, which was active last.

Next, we take the feature at this position (in this case "A"). Since "A" is not predicted in this context, the columns that represent "A" will burst. We connect bi-directionally between some of the cells that fired for position 4 and some of the cells that fired for input "A".

Next, we activate the position before the current position. In this case, we activate the columns representing position 3. Since we previously learned position 3, the cells for feature "B" will enter the predictive state.

Then we connect some of the cells from the current position with cells in predictive state.
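This step can be checked in isolation. A small self-contained sketch, using hypothetical labels rather than real cell state: given what was learned in iterations #1 and #2, activating the position before position 4 puts feature "B" into the predictive set, and the current position links to it.

```python
# Features learned so far (from iterations #1 and #2).
learned_feature = {3: "B", 1: "A"}

current = 4       # the "current position" in iteration #3
prev = current - 1  # activate the position before it

# If the previous position already has a learned feature, those feature
# cells are the ones that enter the predictive state.
predictive = {learned_feature[prev]} if prev in learned_feature else set()

# Connect the current position to whatever is predictive.
links = {("pos", current): {("prev_feat", f) for f in predictive}}
```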

That is probably enough iterations to demonstrate the general idea of the process. After enough iterations of the process, inputting the sequence "A B B…" should result in "A" being predicted, as expected. Additionally, the sequence could be "played back" by activating the positions in order and looking at the predicted cells. Another interesting side effect of this process is that the cells which represent the positions become encoded with a context of the overall object. This should make it possible to recognize the whole object later by inputting a few of the features and their positions.
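The "play back" idea can be sketched as follows, assuming the position-to-feature links have already been learned. The `links` dict below stands in for learned connections (hypothetical labels, not real HTM state): stepping through the positions in order and reading which feature cells turn predictive reconstructs the sequence, even though it was learned out of order.

```python
# Position -> feature links as they would stand after learning "A B B A".
links = {
    ("pos", 1): {("feat", "A")},
    ("pos", 2): {("feat", "B")},
    ("pos", 3): {("feat", "B")},
    ("pos", 4): {("feat", "A")},
}

def play_back(links, num_positions):
    """Activate each position in order; collect the predicted feature."""
    out = []
    for p in range(1, num_positions + 1):
        predicted = [f for (kind, f) in links[("pos", p)] if kind == "feat"]
        out.append(predicted[0])
    return out

print(play_back(links, 4))  # -> ['A', 'B', 'B', 'A']
```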
