Jeff’s talk @ Simons Institute


Does the Neocortex Use Grid Cell-Like Mechanisms to Learn the Structure of Objects?


I think this is the link you want.


Posting before finishing video.
Composite objects! This is the thing missing from ANNs so far. This is exciting.


All too fresh, the link is updated!

Your link seems to go to the wrong video. The correct link is:

In any case, at 48:38 @jhawkins says: “I never think about what is
mathematically expedient. Physicists think that way.”



Explain the neuro pill? What is Jeff saying?

Your talk at Brains, Minds, and Machines 12/15/2017
Dec '17

Hi Jeff, I enjoyed your talk at BMM. Sequential memory may also offer a way to learn physical structure. As we scan our eyes from the front of a dog to the back, we see the head, then the trunk, and then the tail. Similarly, as we scan down from the trunk, we see the legs and then the ground.

When asked to generate pictures of dogs, GANs produce what Yann LeCun calls Salvador Dali dogs: they have features but lack structure. Sequential order seems a powerful way to represent many types of structure.
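As a minimal sketch of the idea above (not HTM temporal memory, just a first-order transition table, with hypothetical function names), part order can be learned from scans and then used to reject arrangements that have the right features in the wrong structure:

```python
from collections import defaultdict

def learn_transitions(scans):
    """Record which part follows which in each scan (first-order only)."""
    transitions = defaultdict(set)
    for scan in scans:
        for current, following in zip(scan, scan[1:]):
            transitions[current].add(following)
    return transitions

def is_consistent(transitions, scan):
    """True if every adjacent pair in `scan` was seen during learning."""
    return all(b in transitions.get(a, set()) for a, b in zip(scan, scan[1:]))

# The two scans described in the post: front to back, and top to bottom.
dog_scans = [
    ["head", "trunk", "tail"],
    ["trunk", "legs", "ground"],
]
model = learn_transitions(dog_scans)

print(is_consistent(model, ["head", "trunk", "legs", "ground"]))  # True
print(is_consistent(model, ["head", "tail", "trunk"]))            # False
```

The second query has all the right parts but in an order never observed, which is roughly the "features without structure" failure described above.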

On your opening remark that “you are all neuroscientists”: I am an engineer interested in building “Machines of loving grace,” to quote the poet Richard Brautigan.


Great talk, always nice to get more video talk content.

It would be interesting to see what the parallels are in language and speech for the structural modelling comments made here with the coffee cup example.


Hi Guys,

first of all, thank you for sharing this!

In this presentation Jeff shares a picture of a cortical column (added below) on one of the slides. This is the first time I have seen this particular picture (in terms of the labels and how many layers are marked), and I’m curious: do you know the answer to any of the questions below?
1. Labels - What do the labels mean (i.e. I know “L2” stands for “Layer 2”, but what does “L5 tt” or “L6 bp” stand for)?
2. Role of L6 ip, L6 mp, L6 bp - Do you know of any hypothesis on what these sublayers encode and what they are connected to?
3. Motor - Does anybody know how exactly to understand the “Motor” part of the “L3a” layer output? What kind of information (at least roughly) is conveyed through this pathway?
4. Cortical and Thalamic input - The feedforward information coming to the cortical column comes from the cortex and the thalamus. What purpose do these two streams have? I.e. is the only purpose of this combined pathway to bring information from one source or the other, or am I looking at it in the wrong way?
The way I’m thinking about this is the following: let’s say we’re talking about some region processing visual information (say, V2). V2 has incoming feedforward pathways from V1 (cortex) or directly from the thalamus (i.e. from the retina directly). And let’s say I’m reading some text. If the text is small enough, V1 is able to process what letters I’m looking at, so it makes sense to pass this information to V2, as V2 could process information related to the whole sentence.
But if the text is huge (let’s say I’m reading from a PC monitor, sitting very close to it, and the font size is huge), then V1 can’t recognize the letters, as its receptive fields are very small. In that case V2 would rather take the feedforward input from the axons of the thalamus to recognize the text (assuming that V2 spans a greater receptive field of the retina than V1 does). Is this the purpose of combining inputs from cortex and thalamus, or does it also serve some other purpose? Or am I completely off?

Thank you in advance for any clarification/explanation/pointer to something.


The six layers of the cortex are what scientists saw over one hundred years ago. It was based on crude visual inspection. We now know there are many unique cell populations not captured by the six layers. It is common to label them as sub-divisions of the six layers. The nomenclature is not consistent. L5tt stands for “thick tufted” and L5cc stands for “cortical cortical”. The letters indicate something about the morphology or connectivity of the cells. L6ip stands for “inverted pyramid” and L6bp for “bi-polar”. Sometimes the same cells will be labeled “L5a” and “L5b”. Most people stopped using that when it was realized that L5a and L5b were reversed in different species.

The cells in layer 6 are the least understood, largely because they are the most difficult to record from. What I labeled L6a and L6b (what I proposed are like grid cells) do not send axons long distances within the region. They are relatively narrowly focused. The other cell types in L6 are known for long distance projections within the region and between regions. No one knows what they do. We suspect that at least one of them carries motor related information. There is evidence that they do connect between motor and sensory regions.

I don’t know of any evidence that L3 is involved in motor stuff. It is well known that L5tt cells are a (and maybe the only) motor output in every column. In our model so far, we have a need for one layer above L4 to represent the object being sensed. Don’t know yet why there are L2, L3a, and L3b. BTW, not every biologist believes L2 actually exists. The difference between these upper layers can be hard to detect.

You are asking the correct questions. No one understands why there are two FF pathways. The general rule is that when region1 projects to region2 it sends a direct signal from L3 in region1 to L4 in region2. It also sends a signal from L5tt in region1 to the thalamus which then sends it to L4 in region2. The thalamus is doing something to this signal, but what is not clear. One thing it is believed to do is attention. Region2 is able to control the thalamus in a way that attends to only part of the data. This is an area we are working on intensively now. I suggest reading Murray Sherman’s book or some of his papers if you want to dig into it. I believe these are listed on the bibliography page on our website.
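The two feedforward routes described above can be pictured with a toy sketch. Everything here is an assumption made for illustration: the function names are hypothetical, signals are plain lists of numbers, and the thalamus is modelled as a simple on/off mask set by region2. What the thalamus really does to the signal is, as noted, not clear.

```python
def thalamic_relay(l5tt_signal, attention_mask):
    """Pass only the attended elements of the L5tt signal; zero the rest.

    Assumption: attention acts as a per-element on/off gate.
    """
    return [value if attended else 0.0
            for value, attended in zip(l5tt_signal, attention_mask)]

def feedforward_to_region2(l3_signal, l5tt_signal, attention_mask):
    """Collect both inputs arriving at L4 of region2."""
    return {
        # Route 1: L3 in region1 projects directly to L4 in region2.
        "direct": list(l3_signal),
        # Route 2: L5tt in region1 -> thalamus -> L4 in region2,
        # with region2 choosing which part of the data to attend to.
        "via_thalamus": thalamic_relay(l5tt_signal, attention_mask),
    }

inputs = feedforward_to_region2(
    l3_signal=[0.2, 0.9, 0.4],
    l5tt_signal=[0.7, 0.1, 0.5],
    attention_mask=[True, False, True],
)
print(inputs["direct"])        # [0.2, 0.9, 0.4]
print(inputs["via_thalamus"])  # [0.7, 0.0, 0.5]
```

The only point of the sketch is the wiring: the direct route arrives unchanged, while the thalamic route can be filtered under the control of the receiving region.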


Hi Jeff,

perfect, thank you very much for taking time to answer the questions!

I see I’ve made one mistake though. In the 3rd question I’m asking why there is a “motor” component to the L3a layer, which is a mistake: as the picture suggests, the “motor” output comes from the “L5 tt” cells. Hence, a last question from me: what hypotheses are there about what the “L5 tt” motor output is used for and where it is used?

Thank you.



Hi Jeff, could you give us a hint where to find the paper about “Mathematically Ideal Brains” that “Bill” referred to in the Q&A?

greetings and thanks for all your great talks

Thanks for an amazing talk, Jeff! I also have a few questions:

  1. What is the purpose of L3? Is this the pooling layer that makes L4 more stable?
  2. Would this be a somewhat correct story? I’m ignoring different regions trying to negotiate the agreement here.
  • We get input to L4 from some sensor.
  • L6a provides the location of that sensor in the object space. (Where does L6a receive that input from? Just a wild guess?)
  • L4 feeds to L3, where a more stable conceptual representation of the signal forms. Does it do temporal pooling to form this stable representation?
  • Then L3 projects to L5 (at this point the signal is stable, in the sense that the layer knows the “thing” it perceives). That gets combined with L6b to understand what that “thing” is in terms of broader knowledge, i.e. whether it’s the handle of a cup or the handle of something else that feels similar to the touch.

Is this a remotely correct intuition?
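One way to make the story above concrete is a toy sketch in which an object is just a set of (location, feature) pairs and each new sensation narrows the set of candidate objects. This is only an illustration of the intuition, not Numenta's implementation; the object names, locations, features, and the function `candidates_after` are all hypothetical.

```python
def candidates_after(objects, sensations):
    """Return names of objects consistent with every (location, feature) pair.

    `objects` maps an object name to its set of (location, feature) pairs.
    The location plays the role attributed to L6a in the post, the feature
    the role of the L4 input, and the surviving candidates the stable
    representation attributed to L3.
    """
    remaining = set(objects)
    for pair in sensations:
        remaining = {name for name in remaining if pair in objects[name]}
    return remaining

objects = {
    "cup": {("side", "handle"), ("top", "rim"), ("bottom", "flat")},
    "door": {("side", "handle"), ("top", "edge")},
}

# A handle alone is ambiguous; a second sensation resolves it.
print(sorted(candidates_after(objects, [("side", "handle")])))
# ['cup', 'door']
print(sorted(candidates_after(objects, [("side", "handle"), ("top", "rim")])))
# ['cup']
```

The handle example mirrors the question in the last bullet: the same feature at the same location can belong to several objects until more (location, feature) evidence arrives.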


What I am interested in is whether this neocortical ability to resolve allocentric geometries includes how the brain learns to use the parallax of objects around us, which we experience when we move, as a method to judge distances between those objects and thus build a better 3D representation of the space between the objects and of their sizes.

Thank you for all that you do.
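For reference, the geometry behind the parallax question can be sketched as follows. This is plain trigonometry, not a claim about how the cortex does it. The assumptions: the observer steps sideways by a known baseline, the object starts straight ahead, and the angular shift is measured against a very distant background. The function name is hypothetical.

```python
import math

def distance_from_parallax(baseline_m, shift_deg):
    """Distance to an object from its apparent angular shift.

    For an object that starts straight ahead, a sideways step of
    `baseline_m` changes its bearing by `shift_deg`, and
    tan(shift) = baseline / distance.
    """
    return baseline_m / math.tan(math.radians(shift_deg))

# Same sideways step of 0.5 m; the object that shifts more is nearer.
near = distance_from_parallax(0.5, 20.0)
far = distance_from_parallax(0.5, 10.0)
print(near < far)  # True
```

Once each object's distance and bearing are known, the distance between two objects follows from ordinary triangle geometry.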

