Let’s do a bit of a deep dive on this - the representation of sounds as a time sequence. This leads to more general questions that seem relevant to this conversation.
In computer programming, we can assign space for variables, pull them apart, and bring them back together as needed for various operations.
In neural tissue, we don’t have this luxury. The various sensory inputs arrive on the same stream, with successive waves laying one atop the next. In vision, the overall scene is recognized in an early part of the visual system, with directions tapped off to the frontal eye fields, which move foveal vision to and fro, picking up select features of the scene in turn. In the center of V1, successive presentations are layered one right on top of the next - perhaps dozens - to build up knowledge of the current environment.
In hearing, we have combinations of frequencies and phase shifts that are combined into sounds, parts of speech, sentences, up to communications that span several sentences. We are able to parse out meanings that span the entire length of an utterance, so simple word combinations alone are not enough to convey meaning.
I have been thinking about this exact point for many years - looking for possible mechanisms that are biologically plausible and capable of holding this information as a palimpsest.
One possibility is that the stream is parsed into a distributed form of some sort, where orthogonal information is spatially separated into regions so that a buffer map can hold more than one piece of information at a time. I envision these islands of meaning as cow spots. William H. Calvin describes them as the cerebral code. [1]
Where this runs into trouble is that sparse distributed representations (SDRs) can only reach so far. The SDR mechanism is strings of synapses along a dendrite. If we had patches of meaning scattered here and there around a map, it would be biologically implausible for dendrites to connect this information; the patches are simply too far apart.
How else can we distribute the information so that it is available over some larger space, yet does not interfere with other information present at the same time?
If each perception went through a “hashing function” that spread it out into a distributed form, it could make many little islands of information that reliably signaled a certain ground truth about what was sensed, and this spread-out form could co-exist with other hashed representations simultaneously. A sparse distributed representation.
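To make that concrete, here is a minimal sketch, assuming the “hashing function” simply maps each percept to a small fixed set of active bits on a large buffer map. The map size, sparsity, and hashing scheme here are illustrative choices of mine, not anything measured or from the literature:

```python
import hashlib

MAP_SIZE = 2048   # bits in the buffer map (illustrative)
ACTIVE_BITS = 40  # ~2% sparsity (illustrative)

def hash_encode(percept: str) -> set[int]:
    """Spread a percept into a sparse set of 'islands' via repeated hashing."""
    active = set()
    counter = 0
    while len(active) < ACTIVE_BITS:
        digest = hashlib.sha256(f"{percept}:{counter}".encode()).digest()
        active.add(int.from_bytes(digest[:4], "big") % MAP_SIZE)
        counter += 1
    return active

# Two percepts can occupy the same map with little interference:
a = hash_encode("spoken-word-A")
b = hash_encode("spoken-word-B")
print(len(a & b))  # chance overlap averages 40*40/2048, i.e. under one bit
```

With 40 active bits out of 2048, two unrelated codes collide on less than one bit on average, so several hashed representations can be superimposed on the same map and still be told apart.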
An SDR neuron could learn a local combination of these islands of meaning that represents some fact about the sensed object.
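Continuing the sketch above, a hedged way to picture both the detector and the reach limit from a few paragraphs back: a dendrite-like segment can only form synapses within a local neighborhood of the map, so it can learn a conjunction of nearby islands but not of islands scattered arbitrarily far apart. The reach and threshold numbers are mine, purely for illustration:

```python
import random

MAP_SIZE = 2048  # same illustrative map size as the sketch above

def make_segment(center: int, reach: int, n_synapses: int) -> set[int]:
    """A dendrite-like segment samples potential synapses only within its reach."""
    lo, hi = max(0, center - reach), min(MAP_SIZE, center + reach)
    return set(random.sample(range(lo, hi), n_synapses))

def segment_fires(segment: set[int], active_bits: set[int], threshold: int) -> bool:
    """The segment signals when enough of its synapses land on active bits."""
    return len(segment & active_bits) >= threshold

# A segment centered at bit 500 with reach 100 can never detect a
# conjunction that depends on islands out near bit 1900:
seg = make_segment(center=500, reach=100, n_synapses=20)
```

This is exactly why the hashing step matters: by scattering redundant islands all over the map, some of them land within any given segment’s reach.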
Coming back to the entorhinal cortex, we have an actual, experimentally measured pattern that satisfies the conditions I just set out. We know that the hippocampus does, in fact, pick out place information from whatever these distributed islands of information we call grids are signaling.
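As a toy illustration of how distributed grid signals could carry unique place information, consider a one-dimensional residue-style code, where each grid “module” reports position only modulo its own period. This is one way grid codes are often described in the theoretical literature, not a claim about how the hippocampus actually decodes them, and the periods below are made up:

```python
PERIODS = [31, 37, 41]  # illustrative module periods, not measured grid spacings

def grid_code(position: int) -> tuple[int, ...]:
    """Each module signals only the phase of the position within its period."""
    return tuple(position % p for p in PERIODS)

def decode(code: tuple[int, ...]) -> int:
    """A downstream reader (the hippocampus, in this analogy) can recover
    the unique position consistent with every module's phase."""
    for pos in range(31 * 37 * 41):  # unique over the product of the periods
        if grid_code(pos) == code:
            return pos
    raise ValueError("no position matches this code")

print(decode(grid_code(12345)))  # -> 12345
```

Each module on its own is ambiguous, just as a single patch of grid firing is, yet the combination pins down one location over a range far larger than any single period.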
When I encountered papers describing grids, I saw exactly the mechanism I had been predicting to solve this theoretical problem, in exactly the form I was expecting. This is much too strong an alignment of Marr’s level 1 (computational) and level 3 (implementational) problem spaces to be ignored.
This is very much the same problem I expect to see solved in other areas of the brain, and as I have stated several times now - since the cortex is able to use its columns to form this grid pattern here, and the same columns are present in other areas - it is highly suggestive that these other areas are doing the same thing.
If not actual grids, then something very much like them.
[1] William H. Calvin, “Compressing the Cerebral Code”:
http://williamcalvin.com/socns94.html
http://williamcalvin.com/bk9/bk9ch6.htm
The entire book:
http://williamcalvin.com/bk9/