Numenta Research on 3 Visual Stream & Deep Predictive Learning

Where we try to make sense of Deep Predictive Learning: A Comprehensive Model of Three Visual Streams.

Wed, Jan 22 at 10:15AM PST.


I see the potential in one part of this model to address a gap in HTM theory that Jeff highlighted around 5:32 in this whiteboarding session with Matt. Of course that video is discussing an older (by several months) version of the theory, but at least in my current understanding, the problem of how to transmit an active prediction back up to L4 is still an open question.

In Figure 1 of the 3VS paper, which you discussed briefly in the research meeting, there is a path for an active prediction to travel from L5 back to L4:


I feel like keeping this potential mechanism in our back pocket could prove useful when theorizing about data flows in a cortical column, and for making predictions that further research can find evidence for or against.


Nice job @rhyolight! That was a very challenging presentation.

I have opinions on this paper’s model, and they are again heavily influenced by some chapters in this book. For the record, it is an old book and its ideas are old, which matches one of Jeff’s comments in the video.

Not that these opinions matter, but I would like to share them anyway, without going into much detail.

  1. The guy in the red checkered shirt described my understanding of this paper exactly, and I’m happy: the paper presents a model that is biologically plausible (by the authors’ own claims) and that allows implicit gradient descent, or as they call it, “error-driven learning”. One inspiration for this type of learning is described in the book mentioned above, which says:
The learning function we adopt for the models in the rest of this text is 
called the eXtended Contrastive Attractor Learning (XCAL) rule. 
(The basis for this naming will become clear later). 
This learning function was derived through a convergence of bottom-up 
(motivated by detailed biological considerations) and top-down 
(motivated by computational desiderata) approaches

The top-down part interests me because it matches my first hunch about the core design of the paper’s model: that it is computationally motivated.

  2. The two major types of learning considered in the book are error-driven and self-organizing. The latter exists in the HTM model; however, error-driven learning cannot.

  3. So let’s assume the paper’s model eventually matures and becomes much more biologically plausible, and everybody is happy. At that stage, I can already predict the conclusion: error-driven learning is very powerful and has to exist in the brain, whether it is done explicitly or implicitly. In this paper, the implementation of error-driven learning is computationally approximately equivalent to backpropagation’s gradient descent, except that it is done implicitly. The next question for computing people is how to improve the performance of this gradient-descent-like function. I think the answer will be to use gradient descent itself (e.g. the kind used in backprop), and then everyone is back to mainstream ANN backpropagation, which we all know doesn’t work all the time. But isn’t that the inspiration for the paper and the book in the first place, since these guys are computational neuroscientists?
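To make the XCAL idea a bit more concrete, here is a minimal Python sketch of the piecewise-linear XCAL function as I understand it from the book. The function names, the `d_rev=0.1` reversal point, the learning rate, and the choice of thresholds (the expectation-phase coproduct for the error-driven case, a long-term average of receiver activity for the self-organizing case) are my assumptions for illustration. This is not the paper's code and not an HTM mechanism.

```python
def xcal(xy, theta, d_rev=0.1):
    """Piecewise-linear XCAL weight-change function (illustrative sketch).

    xy     : sender * receiver activity coproduct
    theta  : floating threshold separating weight decrease from increase
    d_rev  : assumed reversal point, as a fraction of theta
    """
    if xy > theta * d_rev:
        # Above the reversal point: change is linear in (xy - theta),
        # negative below theta and positive above it.
        return xy - theta
    # Below the reversal point: change ramps back toward zero as xy -> 0.
    return -xy * (1.0 - d_rev) / d_rev


def error_driven_dw(x_out, y_out, x_exp, y_exp, lrate=0.1):
    # Assumed error-driven form: the threshold is the coproduct from the
    # expectation phase, so the weights stop changing once the
    # expectation matches the outcome.
    return lrate * xcal(x_out * y_out, x_exp * y_exp)


def self_organizing_dw(x, y, y_long_avg, lrate=0.1):
    # Assumed self-organizing form: the threshold is a long-term average
    # of the receiving unit's activity (a BCM-like floating threshold).
    return lrate * xcal(x * y, y_long_avg)


if __name__ == "__main__":
    print(error_driven_dw(0.9, 0.9, 0.5, 0.5))  # outcome stronger than expected
    print(error_driven_dw(0.5, 0.5, 0.9, 0.9))  # outcome weaker than expected
    print(error_driven_dw(0.6, 0.7, 0.6, 0.7))  # expectation matched outcome
```

The point of the sketch is that the same function gives either flavor of learning depending only on where the threshold comes from, which is how I read the "bottom-up plus top-down" derivation in the quote above.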

By the way, the book is much easier to read, and it provides good background on the inspiration for the model described in the paper.


For those interested in pursuing this further, there is a modern implementation of the framework in Go with a Python front end: