One of the things I took from your latest video is that both layers represent the location of objects. In the example of the coffee cup with the Numenta logo on it:
(A)
L6a represents the locations of the Numenta logo's features (sub-objects: shapes, color, etc.) on the cup. Cup-centric
L6b represents the location of the cup in some higher space. Higher-space-centric!?
(B)
L6a represents the locations of the Numenta logo's features on the Numenta logo. Logo-centric
L6b represents the location of the Numenta logo on the cup. Cup-centric
Which one of these two is the correct interpretation? If it’s (A) then what is this higher space? The room? A convergence of all information about this particular cup?
Nick,
The short answer is that we don't yet know the correct interpretation of L6a and L6b. Here is what I am pretty sure of. L6a cells (the ones that project both to L4 and back to the thalamus) represent location in the space of the currently attended object. If I am attending to the logo, then these cells represent locations in the space of the logo. If I am attending to the overall cup, then they represent locations in the space of the cup.

I am sitting at a table right now, and as I look around the table I see a book, a mouse, a cup, my computer, these words on my display, etc. As I attend to these various objects, the same cells in L6a in region V1 change to represent locations in different spaces. It happens rapidly.
We are also confident that L5tt cells represent a displacement, or transform, between two spaces. This is an important idea that explains a lot of things.
After this, our confidence on the details starts to get murky.
What about L6b? At first we thought that L6b cells represented a larger space, such as the room or the table. In this scenario, L6a and L6b together would represent two spaces, and we could learn in L5tt the relative displacement of objects with respect to the table. This might still be true.
However, there is another possibility. Say I am looking at a book on my table, and the L6a cells represent a location relative to the book. I then move my eyes to look at my computer mouse. The movement will update L6a to represent a new location relative to the book (albeit a location on the mouse, not on the book itself). But then the mouse will be recognized, and L6a will switch to locations relative to the mouse. So L6a quickly switches from a location in book space to the equivalent location in mouse space. These two states can be used in L5tt to determine the location of the mouse relative to the book. In this scenario we don't need L6b at all.
So in scenario 1, L5tt cells represent the displacement of attended objects relative to a larger composite space, and in scenario 2, L5tt cells represent the displacement of objects that were attended to in succession. In both cases we believe L5tt represents a union of displacements for the set of recently attended objects.
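As a loose illustration of scenario 2 (my own toy sketch, not Numenta's actual model — the function names, coordinates, and the "displacement as vector difference" simplification are all made-up assumptions), the two successive L6a states can be treated as coordinates of the same physical point in two object-centric frames, and an L5tt-like displacement falls out as a vector difference. A union of such displacements can then be kept for the recently attended objects:

```python
# Toy sketch of scenario 2 -- NOT Numenta's model. A "location" is just a
# 2-D tuple, and every name and number here is hypothetical.

def displacement(loc_in_frame_a, loc_in_frame_b):
    """Offset of frame B's origin within frame A, inferred from the same
    physical point's coordinates in both frames (plain vector algebra)."""
    return tuple(a - b for a, b in zip(loc_in_frame_a, loc_in_frame_b))

# The eyes land on the mouse. Just before recognition, L6a holds the point's
# location in book space; just after, the equivalent location in mouse space.
point_in_book_space = (0.30, 0.10)
point_in_mouse_space = (0.05, 0.02)

# L5tt-like state: where the mouse sits relative to the book.
mouse_relative_to_book = displacement(point_in_book_space, point_in_mouse_space)

# A crude "union of displacements" over recently attended objects.
recent_displacements = {("book", "mouse"): mouse_relative_to_book}
```

The only point of the sketch is that no second location layer is needed: two successive states of the same population, plus simple algebra, are enough to relate the two objects.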
There are numerous pros and cons to both of these scenarios. We don’t have enough data yet to know exactly what is going on.
BTW, the anatomical data on L6b is not as rich as for L6a. Some authors make the distinction between 6a and 6b and some don't.
I hope this helps. We are working hard to resolve these questions, but it takes time.
Jeff
In this slide from an older video you assign L6a the purpose of mapping orientation to a cortical column.
I understand that recognition through touch is easier than recognition through vision, but how do you correctly incorporate head orientation in visual recognition?
In the example of the coffee cup with the Numenta logo on it, when the attended object is the Numenta logo:
Does L6a represent both the location of features of the logo and the orientation of the head?
Again, these are details for which we don’t yet have definitive answers. Here are some things to consider.
Layer 6a projects to L4. If L6a is sufficient to allow L4 to predict its input, then L6a must encode both location and orientation, because you need to know the orientation to make an accurate prediction.
In Entorhinal Cortex there are Head Direction cells, which are a form of orientation, and there are grid cells, which encode location.
Most of the cells in EC are a combination of both HD and grid cells, referred to as “conjunctive” cells.
It is likely that L6a cells are some form of conjunctive cell (location plus orientation). That is what is needed to predict the input.
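As a very loose illustration of what "conjunctive" means here (my own toy encoding, not a biological model — the population sizes, one-hot scheme, and indices are arbitrary assumptions), a conjunctive population can be thought of as the product of a location code and a head-direction code, so that a single active cell implies both a place and a heading:

```python
# Toy illustration, not a biological model: encode "location x orientation"
# conjunctively, so one active cell implies both a place and a heading.
# Population sizes and indices are arbitrary assumptions.

N_LOCATIONS = 6   # discretized locations in the attended object's space
N_HEADINGS = 4    # discretized head directions

def conjunctive_code(location_idx, heading_idx):
    """One-hot 'conjunctive cell' population: each cell fires only for one
    specific (location, heading) pair."""
    code = [0] * (N_LOCATIONS * N_HEADINGS)
    code[location_idx * N_HEADINGS + heading_idx] = 1
    return code

def decode(code):
    """Recover both factors from the single active conjunctive cell."""
    active = code.index(1)
    return divmod(active, N_HEADINGS)  # -> (location_idx, heading_idx)

code = conjunctive_code(2, 1)
```

The point of the toy encoding is only that one population can carry location and orientation jointly, which is what L4 would need in order to predict its input.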
In the slide above I labeled L6a as orientation and L6b as location. That was my best guess at the time; I hope I said so in my talk. But, as I said above, it is more likely that L6a cells are conjunctive. We are currently working on these issues. Hopefully we can make some progress soon.