1000 Brains Theory Q&A

You cover a lot of topics in this post. I will try to answer several of them. Let me start with what we mean by the “sensor fusion problem.” We humans have a unified perception of the world. This is derived from a large number of inputs from different sensory modalities. When the inputs arrive in the neocortex they are expanded. The million fibers from the retina become tens of millions of fibers leaving V1. V2 is even bigger. The concept of an “image” is immediately lost, as the representations in V1 and V2 are highly distorted and pieces are missing. The inputs to the brain are also constantly changing, yet we are not aware of most of these changes. The sensor fusion problem is this: how do these disparate, distorted, and changing features get fused into a single stable perception?

If we think about touch, the problem seems harder. There is no “image” on your skin, just patches of sensation that sense different parts of an object over time, yet you still perceive a stable and complete object as you touch it. How are somatic inputs fused into a single, stable percept?

Sensor fusion also has to work across modalities. Often, as in your example, we perceive an object based on partial input from one sense and partial input from another sense, neither of which might be sufficient.

The standard belief is that sensory regions such as V1 and V2 are extracting features and that somehow these features are combined in higher regions of the cortex to create a single and stable perception. You say you can see how this happens in association areas, but no one has any idea how real neurons in the brain do this. You might argue that artificial neural networks work this way, but that is a red herring. ANNs require over one hundred hierarchical levels, assume each level is uniform (convolution), and have no concept of integration and stability over time. I am not aware of any ANN solution for touch.

The thousand brains theory (TBT) proposes a distributed sensor fusion solution. Instead of there being one model of each object in the world (presumably in some region high up the hierarchy that combines all sensors), there are many models. Each model is based on whatever inputs it is receiving. The “fusion” occurs via long-range connections in the neocortex. The vast majority of observed long-range connections are not hierarchical and can’t be explained in the hierarchical feature extraction paradigm. In the TBT, models are in essence “voting” on object identity. Each model may have uncertainty, but together they reach the correct inference. For example, say your visual models can’t tell if the animal they are seeing in the bushes is a cat or a dog, and your auditory models can’t tell if what they are hearing is a cat or a bird. Through the long-range connections they will all quickly settle on “cat” as the object. We showed simulations, the model, and code in our 2017 columns paper.
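
To make the cat/dog/bird example concrete, here is a toy sketch in Python. To be clear, this is not the code from the 2017 columns paper, and it is not how neurons do it. It assumes each model simply reports a set of candidate objects and that "voting" can be approximated by keeping only the candidates every model agrees on. The function name `vote` and the candidate sets are made up for illustration.

```python
from functools import reduce


def vote(candidate_sets):
    """Toy consensus: keep only the objects that every model considers possible.

    Hypothetical helper for illustration; real columns would exchange
    graded evidence over time, not exact sets in one step.
    """
    sets = [set(c) for c in candidate_sets]
    if not sets:
        return set()
    # Intersect all candidate sets, one after another.
    return reduce(set.intersection, sets)


# Each model is uncertain on its own.
visual_candidates = {"cat", "dog"}     # vision cannot tell cat from dog
auditory_candidates = {"cat", "bird"}  # hearing cannot tell cat from bird

consensus = vote([visual_candidates, auditory_candidates])
print(consensus)  # {'cat'}
```

Neither set is sufficient alone, but the only object consistent with both is "cat". A strict intersection is the simplest possible stand-in; it would return an empty set if any single model were wrong, which is one reason a graded, evidence-weighted vote would likely be more realistic.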

This is an ongoing debate. There are numerous papers on it. I was at a conference a month ago where this was a constantly discussed topic for 2.5 days. I would say that the vast majority of neuroscientists feel that there is a common algorithm throughout the neocortex and that the variations that are observed are just that, variations on a theme. There is a huge amount of empirical data supporting this idea.

The TBT does not propose each column learns models of every possible object. We explicitly state this in the frameworks paper. What we say is that there are many models of each object. For any object there might be multiple models based on visual input, multiple models based on somatic input, and multiple models based on auditory input. Each model is different because it is based on different inputs. But if different columns in the neocortex are modeling the same object, then long range connections between them make sense.

The TBT also does not propose there is no hierarchy in the neocortex. We state this in the frameworks paper too. The big difference is that in the old way of thinking, features are passed up the hierarchy. The TBT states that complete objects are passed up the hierarchy if possible.
