An important capability that I need for the HTM-based RL system that I am developing is the ability to extend predictions further and further back through time the more often a particular sequence is encountered. I was initially doing this by walking through the activation history to grow distal connections from correctly predicted cells to cells which were active multiple time steps in the past, using a TD(λ)-like strategy. This strategy has a few drawbacks:
- It requires holding onto a significant amount of state information (for cells and, more significantly, synapses), which has a big impact on memory requirements. Memory consumption grows quickly with each time step that you need to refer back to.
- Walking the history takes time. This can cause a significant performance hit depending on how far back through the history you need to go.
- Assuming you do not have enough memory to hold onto a complete history, or that walking the complete history would be too time-consuming, you are forced to limit how far back a prediction can be extended.
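Roughly, the history-walking approach I started with looks like the sketch below (all names are hypothetical stand-ins, and the real version tracks far more per-synapse state). The deque of per-step activation sets is the memory problem, and the nested walk over it is the time problem:

```python
from collections import deque

HISTORY_LENGTH = 50  # the arbitrary cap on how far back predictions can extend

# One full set of active cells retained per time step; with thousands of
# cells (and per-synapse eligibility traces, not shown), memory grows
# quickly with HISTORY_LENGTH.
activation_history = deque(maxlen=HISTORY_LENGTH)

def on_time_step(active_cells, correctly_predicted_cells, grow_synapse):
    """Walk the whole history, TD(lambda)-style: grow distal connections
    from each correctly predicted cell to cells active in the past, with
    influence decaying the further back we walk."""
    for steps_back, past_active in enumerate(reversed(activation_history), 1):
        decay = 0.9 ** steps_back  # lambda-like decay per step back
        for cell in correctly_predicted_cells:
            for past_cell in past_active:
                grow_synapse(cell, past_cell, strength=decay)
    activation_history.append(set(active_cells))
```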
I decided to see if I could theorize a strategy which doesn't require holding onto any more than a couple of time steps of historical state information, and which does not impose an arbitrary limit on how far back in time a prediction can be extended. I also wanted an implementation that at least appears to be biologically plausible (granted, I have zero knowledge of neuroscience, so I am sure there are holes in my logic…)
I had a thought about the concept of distal synapses as “coincidence detectors” (i.e. if enough of them are activated within a particular time window, the cell becomes predictive). It occurred to me that there is a possible property of a continuous-time system that may have been overlooked in a discrete-time implementation. Namely, the potential synapses active during that time window could include not only synapses to cells active for the last element in a sequence, but also to some percentage of the cells active for the element before that. Those potential synapses might therefore also be strengthened each time the sequence is encountered, and eventually become connected. This would allow the cell to become predictive just a bit further back in time, shifting the time window slightly to the left. That in turn would allow even earlier potential synapses to be strengthened, eventually also becoming connected and shifting the time window even further to the left, and so on. In theory, this process would lead to predictions being extended further and further back in time the more often a sequence is encountered, with no theoretical limit.
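To put rough numbers on that overlap (the values below are entirely made up for illustration): if elements arrive every 10 ms but the coincidence window is 15 ms wide, then when a cell is driven by the current element, about half of the previous element's activity also falls inside the window.

```python
window = 15.0          # ms: width of the coincidence window (made-up value)
element_period = 10.0  # ms between sequence elements (made-up value)

# Fraction of the previous element's activity still inside the window.
overlap = max(0.0, window - element_period)
print(f"{overlap / element_period:.0%} of the previous element's "
      "activity falls inside the window")  # -> 50%
```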
In theory, this behavior could be emulated in a discrete-time implementation by not only creating/strengthening connections with cells active at T-1, but also creating a (configurable) number of new connections with cells active at T-2, up to some (configurable) maximum that is equal to or greater than the activation threshold. After enough iterations of the sequence, the cell will be predicted from T-2, and the process would repeat, extending the prediction back another time step, and so on.
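A minimal sketch of that learning step, assuming a hypothetical segment/synapse object model (`segment.synapses`, `syn.presynaptic`, `syn.permanence`, `segment.add_synapse` are stand-ins, and all parameter values are placeholders). Note that only the active-cell sets from T-1 and T-2 are needed, which is the whole point of the scheme:

```python
import random

def learn_on_correct_prediction(segment, active_t1, active_t2,
                                perm_inc=0.05, new_t2_synapses=4,
                                max_t2_synapses=15):
    """Reinforce the distal segment of a cell that just activated as
    predicted, given the active-cell sets from one and two steps ago."""
    # Strengthen existing synapses to cells active at T-1 or T-2. (T-2
    # synapses must be strengthened too, or they could never cross the
    # connected threshold and extend the prediction back a step.)
    for syn in segment.synapses:
        if syn.presynaptic in active_t1 or syn.presynaptic in active_t2:
            syn.permanence = min(1.0, syn.permanence + perm_inc)
    # Grow a few brand-new synapses to cells active at T-2, up to a
    # configurable maximum that should be >= the activation threshold,
    # so T-2 cells alone can eventually drive the prediction.
    existing = {syn.presynaptic for syn in segment.synapses}
    t2_count = sum(1 for c in existing if c in active_t2)
    candidates = [c for c in active_t2 if c not in existing]
    n = min(new_t2_synapses, max_t2_synapses - t2_count, len(candidates))
    for presyn in random.sample(candidates, max(0, n)):
        segment.add_synapse(presyn, permanence=0.21)
```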
Where this theory breaks down, of course, is the part of the process where, if a cell is predicted but does not become active in the next time step, the permanences of the active synapses which led to that prediction are degraded. This would cancel out the permanence increases made with cells at T-2, which predicted an activation that happens at T rather than at T-1. This can be addressed by only applying the permanence decreases if the cell is neither active nor predictive in the next time step (versus only if the cell is not active). In other words, as long as a predictive cell is still predictive in the next time step, the connections which led to the prediction will be maintained (and possibly strengthened).
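Using the same hypothetical object model as the sketch above (`cell.active`, `cell.predictive`, and `cell.distal_segment` are stand-ins), the modified punishment rule might look like this, called each step for every cell that was predictive at the previous step:

```python
def maybe_punish(cell, active_t1, perm_dec=0.03):
    """Weaken the synapses behind a prediction that led nowhere.

    Standard temporal memory punishes as soon as a predicted cell fails
    to activate. The modification is the `or cell.predictive` check:
    punishment is skipped while the cell is still predictive, so a
    prediction that was merely *early* (driven by cells at T-2 and
    beyond) is not unlearned."""
    if cell.active or cell.predictive:
        return
    for syn in cell.distal_segment.synapses:
        if syn.presynaptic in active_t1:
            syn.permanence = max(0.0, syn.permanence - perm_dec)
```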
I would like to address something that might initially seem to be a problem with this theory: the case where a cell is predictive for more than one time step, but never becomes active and eventually falls out of the predictive state. Since we are only degrading connections with cells active in the previous time step, what about the connections with cells active in earlier time steps? This actually works itself out if you think it through – when the connections with T-1 become weak enough, the cell will eventually no longer be predicted by input at T-1, and thus the permanence decreases will shift back one time step. This will continue to bubble back through time as the new sequence is re-encountered.
Consider, for example, a well-trained sequence ABCD. The predictions would start out like so:
A: predicts B, C, and D
B: predicts C and D
C: predicts D
D: (no predictions)
Then we begin training sequence ABCF, and prediction updates begin to bubble back like so:
(iteration #1)
A: predicts B, C, and D
B: predicts C and D
C: (D not predicted)
F: (no predictions)
(iteration #2)
A: predicts B, C, and D
B: predicts C and F (D not predicted)
C: predicts F (D not predicted)
F: (no predictions)
(iteration #3)
A: predicts B, C, and F
B: predicts C and F
C: predicts F
F: (no predictions)
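To check this bubbling behavior, here is a toy simulation of the rules above. It is entirely my own simplification: one cell per symbol, a prediction threshold of a single connected synapse, a connected permanence of 0.5, and learning that reaches back only to T-1 and T-2 (so the eventual A-to-F link, which needs the repeated window-shifting process, is not modeled). The exact iteration at which each prediction appears or drops depends on the parameter values, so it will not match the table above step for step.

```python
CONNECTED, LEARN, PUNISH, INITIAL = 0.5, 0.2, 0.2, 0.3

# perm[target][source]: permanence of the distal synapse from `source`
# onto `target`, pre-trained on the sequence ABCD.
perm = {
    "B": {"A": 0.9},
    "C": {"A": 0.9, "B": 0.9},
    "D": {"A": 0.6, "B": 0.6, "C": 0.6},
    "F": {},
}

def predictions(active):
    """Cells with a connected synapse from the currently active cell."""
    return sorted(t for t, srcs in perm.items()
                  if srcs.get(active, 0.0) >= CONNECTED)

for iteration in range(1, 5):
    print(f"iteration #{iteration}")
    # State from T-1 and T-2; reset between presentations of the sequence.
    prev_pred, t1, t2 = [], None, None
    for symbol in "ABCF":
        now_pred = predictions(symbol)
        # Modified punishment: a cell predictive at T-1 loses permanence
        # on the synapse that predicted it only if it is now neither
        # active nor still predictive.
        for cell in prev_pred:
            if cell != symbol and cell not in now_pred and t1 in perm[cell]:
                perm[cell][t1] = max(0.0, perm[cell][t1] - PUNISH)
        # Learning: create/strengthen synapses onto the active cell from
        # the cells active at T-1 and T-2.
        for src in (t1, t2):
            if src is not None:
                p = perm[symbol].get(src)
                perm[symbol][src] = INITIAL if p is None else min(1.0, p + LEARN)
        print(f"  {symbol}: predicts {', '.join(now_pred) or 'nothing'}")
        prev_pred, t1, t2 = now_pred, symbol, t1
```

Running it shows the qualitative shape I described: D drops out of C's predictions first, then B's, then A's, one iteration at a time, while F shows up under B and C as its new synapses cross the connected threshold.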
As of now, this is just a theory. I thought I would write it up here for discussion before I run too far with it. Is this really biologically feasible? Are there other theories that are better or more feasible, which would have the desired effect of extending predictions back in time? Any obvious flaws?