A little more on temporal memory

I have been reading and rereading a paper[1] that was posted on this site a while ago.
In this paper they construct a bold and powerful model of the visual system with impressive results. Every time I read it I see something new.

In this paper the authors are fully invested in predictive memory, citing Hawkins, J., & Blakeslee, S. (2004). On Intelligence. New York, NY: Times Books.

Where it gets very interesting is that they go on to propose a very different system of temporal memory. Their model incorporates some of the powerful features of error back-propagation in learning. In essence, they take the oscillatory cycles and consider that different layers are affected based on the phase of the sensed stream vs the prevailing alpha cycle. They consider a minus and a plus phase, with the pulvinar providing the substrate for this activity. The minus phase is the best-guess prediction, and the plus phase is the big reveal: the actual outcome. The difference between the two is the delta that drives error-based training. This method is both simple and powerful, and very biologically plausible. It maps onto the biology like this:

“Biologically, within the first 75 msec period of the overall 100 msec alpha cycle, the entire network interactively settles or converges on an integrated representation of the current state throughout the superficial layers, while the deep layers generate their best prediction of what will happen next, and project this to the pulvinar. The full network of brain areas can thus work together to collaboratively produce the best possible representation, with individual pyramidal neurons sending standard excitatory signals to other pyramidal neurons, amid a background of dynamic surround inhibition. Then, when the plus-phase outcome state is experienced over the last 25 msec of the alpha cycle (driven by burst firing of deep layer 5IB intrinsic bursting neurons that send strong feed-forward driving inputs to pulvinar thalamic relay cells (TRC’s), as elaborated below), any differences between this outcome state and the prior prediction state are experienced as ripples of propagating activation-state differences emanating from the pulvinar and penetrating throughout the network.”
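
To make the minus/plus idea concrete for myself, here is a minimal toy sketch in Python. It is not the authors' model: the single linear weight matrix, the one-hot toy sequence, the learning rate, and the function names (`predict`, `learn_step`, `train`) are all my own assumptions for illustration. The only things taken from the quote are the 75 msec prediction phase, the 25 msec outcome phase, and the idea that the difference between the two drives learning.

```python
# Toy sketch of minus/plus phase error-driven learning.
# Assumption: a single linear weight matrix stands in for the whole network;
# the paper's actual model is far richer than this.

ALPHA_MS = 100                  # one alpha cycle, per the quoted passage
MINUS_MS = 75                   # minus phase: network settles on a prediction
PLUS_MS = ALPHA_MS - MINUS_MS   # plus phase: the actual outcome arrives


def predict(weights, state):
    """Minus phase: linear best guess of the next input."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def learn_step(weights, state, outcome, lr=0.5):
    """Nudge weights by (plus - minus) and return the squared error."""
    minus = predict(weights, state)                   # prediction
    delta = [p - m for p, m in zip(outcome, minus)]   # plus minus minus
    for i, d in enumerate(delta):
        for j, s in enumerate(state):
            weights[i][j] += lr * d * s               # delta-rule update
    return sum(d * d for d in delta)


def train(sequence, epochs=20, lr=0.5):
    """Learn to predict each next item of a repeating sequence."""
    n = len(sequence[0])
    weights = [[0.0] * n for _ in range(n)]
    errors = []
    for _ in range(epochs):
        total = 0.0
        for t, state in enumerate(sequence):
            outcome = sequence[(t + 1) % len(sequence)]  # what happens next
            total += learn_step(weights, state, outcome, lr)
        errors.append(total)
    return weights, errors


if __name__ == "__main__":
    seq = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # hypothetical toy input stream
    weights, errors = train(seq)
    print("first epoch error:", errors[0])
    print("last epoch error:", errors[-1])
    print("prediction after [1, 0, 0]:", predict(weights, seq[0]))
```

The summed error shrinks every epoch as the prediction converges on the outcome, which is all the sketch is meant to show: no separate teacher signal is needed, because the next moment of input is the target.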

They go on to cite a considerable number of supporting papers that look solid when I read them.

So I am left with some choices: the authors are simply wrong, HTM is wrong, or both models are working at the same time.

Assuming that both camps are correct, one intriguing possibility emerges: the layers are affected differently because of the layer choices of their input and output connections. It is possible that the feedback path learns differently than the feedforward path.

[1] Deep Predictive Learning: A Comprehensive Model of Three Visual Streams
Randall C. O’Reilly, Dean R. Wyatte, and John Rohrlich
https://arxiv.org/abs/1709.04654
