Unsupervised learning and storing the location with the feature

The theory covers both learning new objects and recognizing previously learned objects. We ran simulations showing both learning and inference in the paper. However, we didn't show them happening at the same time. We believe learning and inference do occur simultaneously, but at the time we wrote the paper we didn't have a good explanation for exactly how. (The same issue existed in our sequence memory.) In the past month we think we have discovered the answer. There is a 6-8 Hz theta rhythm in many parts of the brain, including the cortex, and there is strong evidence that learning and inference occur at different phases of the theta cycle. This is an elegant solution, and we will include this idea in future simulations.
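As a toy illustration of that idea (not from the paper), each theta cycle can be split into an early inference phase and a late learning phase. Everything here is invented for illustration: the "model" is just a set of stored patterns, and the rhythm is modeled only as one loop iteration per cycle.

```python
# Toy sketch: interleave inference and learning within each theta cycle.
# The stored-pattern "memory" is purely illustrative, not the paper's model.

class ToyMemory:
    def __init__(self):
        self.known = set()

    def infer(self, pattern):
        """Early theta phase: is this pattern already recognized?"""
        return pattern in self.known

    def learn(self, pattern):
        """Late theta phase: store the pattern if it was novel."""
        self.known.add(pattern)

mem = ToyMemory()
results = []
for pattern in ["A", "B", "A"]:          # one theta cycle per input
    results.append(mem.infer(pattern))   # inference phase
    mem.learn(pattern)                   # learning phase

print(results)  # [False, False, True] -- "A" is recognized on its repeat
```

The point of the interleaving is that the same input stream drives both processes without one corrupting the other: recognition happens before the current input modifies the stored memory.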

The theory does not require supervision, but the paper didn't show all of the required components, as we didn't know them until very recently.

In the paper, we assumed we knew the location but not the object. We didn't show how the location was determined, but suggested that the grid cells in entorhinal cortex solve a similar problem and that the same mechanism might occur in cortex. We have since made a lot of progress on exactly how the location is determined. Marcus Lewis has been working on this problem. The solution is fascinating and hard to describe in a forum post. Locations are unique to both the object and the position on the object, in the same way that grid cells encode a location that is unique to both the location and the room the animal is in. (This is not how we described locations in the paper; we didn't know at that time that locations were object-specific.) The process to recognize the location AND the object is:

1. Get a sensory input and form a union of all locations that have that sensory feature.
2. Move the sensor and predict a union of new possible locations.
3. This union predicts a set of new possible features.
4. The next sensory input arrives and matches only some of the predictions, which narrows down the union of locations.
5. Move the sensor again, and repeat.

Each movement of the sensor and the consequent new sensory input reduces the set of possible locations/objects. This solution requires two layers of cells, for example, L4 and L6a. We believe L6a works like grid cells and predicts new locations based on movement.
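The narrowing process above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the objects, feature names, and the integer location space are all invented stand-ins (the real representations are sparse distributed codes, and locations are object-specific as described above).

```python
# Minimal sketch of the sense -> union -> move -> predict -> narrow loop.
# Each object is a made-up map from location to the feature sensed there.

OBJECTS = {
    "cup":  {0: "edge", 1: "flat", 2: "edge", 3: "handle"},
    "bowl": {0: "edge", 1: "flat", 2: "flat", 3: "edge"},
}

def sense(objects, feature):
    """Step 1: form the union of every (object, location) with this feature."""
    return {(name, loc)
            for name, feats in objects.items()
            for loc, f in feats.items() if f == feature}

def move_and_sense(objects, candidates, delta, feature):
    """Steps 2-4: shift every candidate location by the movement, then keep
    only candidates whose predicted feature matches the new sensory input."""
    moved = {(name, loc + delta) for name, loc in candidates}
    return {(name, loc) for name, loc in moved
            if objects[name].get(loc) == feature}

# The sensor (unknowingly) starts at location 1 on the cup and feels "flat";
# it then moves +2 and feels "handle". Only the cup is consistent with that.
candidates = sense(OBJECTS, "flat")
candidates = move_and_sense(OBJECTS, candidates, 2, "handle")
print(candidates)  # {("cup", 3)} -- a single (object, location) remains
```

Note that one movement resolves both ambiguities at once: because each candidate is an (object, location) pair, narrowing the union identifies the object and the sensor's location on it simultaneously.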

I apologize if this is hard to understand; it deserves a 20-30 minute talk to describe properly. We hope to publish a complete theory of how all this works in the next year.
