In current HTM theory, there are spatial and temporal poolers. In this post, I propose the introduction of functional poolers, and describe how they lead to the emergence of meaning (full paper).
According to HTM theory, basal dendrites represent expectations (about which element of the sequence will be perceived next) and have the function of contextualizing the feedforward input to the region along the temporal dimension - in other words, as an element of a sequence.
Similarly, I believe that apical dendrites, while representing top-down expectations, have the function of contextualizing the feedforward input to the region along a non-temporal dimension - in other words, of adding meaning to it.
Note: in this post I do not mention grid cells at all. This is because the process described here is fully compatible with them, but does not require their presence in order to be explained.
How does a functional pooler work?
A functional pooler works exactly like a temporal pooler, except that neurons do not enter a predictive mode based on the activation status of neurons in the same region (detected through their basal dendrites), but based on the signals received on their apical dendrites coming from other parts of the cortex. Such information is used as “context”. Those not familiar with HTM theory might read a full description of a functional pooler in the paper linked at the top of this post.
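To make this concrete, here is a minimal sketch in Python of the step described above; the array names, sizes, connection probabilities, and threshold are assumptions made purely for illustration, not part of any reference implementation. Cells become predictive when their apical dendrites receive enough contextual input from elsewhere in the cortex, and each active column then fires its predicted cells (or bursts when none is predicted):

```python
import numpy as np

rng = np.random.default_rng(0)

N_COLUMNS = 32          # columns in the region
CELLS_PER_COLUMN = 8    # neurons per column
N_APICAL_INPUTS = 128   # axons arriving on apical dendrites from other cortical areas
APICAL_THRESHOLD = 3    # connected, active apical synapses needed to predict a cell

# Hypothetical connectivity: apical_synapses[c, k, a] is True if cell k of
# column c has a connected apical synapse to apical axon a.
apical_synapses = rng.random((N_COLUMNS, CELLS_PER_COLUMN, N_APICAL_INPUTS)) < 0.1

def apical_predictions(apical_activity):
    """Cells whose apical dendrites see enough active context become predictive."""
    overlap = apical_synapses.astype(int) @ apical_activity.astype(int)
    return overlap >= APICAL_THRESHOLD        # shape: (columns, cells per column)

def select_cells(active_columns, predictive):
    """In each active column, fire the predicted cells; burst if none is predicted."""
    active_cells = np.zeros((N_COLUMNS, CELLS_PER_COLUMN), dtype=bool)
    for c in np.flatnonzero(active_columns):
        active_cells[c] = predictive[c] if predictive[c].any() else True
    return active_cells

# Example: some apical context arrives and the spatial pooler activated a few columns.
apical_activity = rng.random(N_APICAL_INPUTS) < 0.2
active_columns = rng.random(N_COLUMNS) < 0.1
cells = select_cells(active_columns, apical_predictions(apical_activity))
print(int(cells.sum()), "cells fire in", int(active_columns.sum()), "active columns")
```

The only difference from the usual temporal-memory step is the source of the predictive signal: apical context arriving from other areas instead of basal input from within the same region.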
In other words, the process is as follows:
- Information enters the region from its receptive field.
- The spatial pooler “selects” which columns to activate. The pattern of columns that activate represents the features of whatever object was observed.
- The temporal pooler “selects” some of the neurons in the active columns which will fire. The pattern of neurons that fire represents the position, within the sequence, occupied by the features recognized during the previous step.
- The functional pooler “selects” some of the neurons in the active columns which will fire. The pattern of neurons that fire represents “contextualized features”.
It is important to understand that those contextualized features represent all associations between a pattern observed in the receptive field of the column (the content) and a pattern observed somewhere else in the cortex (the context). All associations lead to activation of neurons, regardless of relevancy; the next two steps take care of weeding out the irrelevant ones.
- The neurons of the region become part of the receptive field of the next region in the hierarchy. Consequently, the spatial pooler of the next region tries to recognize patterns of activation in all the neurons of the region of steps 1-4: in other words, it tries to recognize patterns of contextualized features.
- The neurons of the first region (the one representing contextualized features) which were active and consistently led to the activation of a column in the second region will form strong synapses, whereas the neurons of the first region which were active but did not consistently lead to the activation of columns in the second region will not form strong synapses.
The consequence is that only relevant contextualized features will form strong synapses, and will be “taken into account” in future pattern recognitions.
I acknowledge that the six steps described above are not enough to fully explain the process; the paper linked at the top of this post contains a more extensive description.
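To illustrate steps 5 and 6, here is a minimal sketch, assuming a spatial-pooler-style permanence update and made-up sizes and learning constants. Region-1 cells whose activity consistently co-occurs with a region-2 column end up with strong synapses to it, while coincidentally active cells do not:

```python
import numpy as np

rng = np.random.default_rng(1)

N_REGION1_CELLS = 200     # output cells of the first region (contextualized features)
N_REGION2_COLUMNS = 40    # columns of the second region
CONNECTED = 0.5           # a synapse counts as "strong" above this permanence
P_INC, P_DEC = 0.05, 0.02 # illustrative learning increments

# Potential synapses from region-1 cells to region-2 columns, initialised near threshold.
permanence = rng.uniform(0.45, 0.55, size=(N_REGION2_COLUMNS, N_REGION1_CELLS))

def learn(active_cells, active_columns):
    """Spatial-pooler-style update: each active column of the second region
    reinforces synapses to region-1 cells that were active and weakens synapses
    to cells that were silent. Inactive columns do not learn."""
    for col in np.flatnonzero(active_columns):
        permanence[col, active_cells] += P_INC
        permanence[col, ~active_cells] -= P_DEC
    np.clip(permanence, 0.0, 1.0, out=permanence)

# Two toy "objects": object A activates column 0, object B activates column 1.
# Relevant context (cells 0-9) fires only with object A; coincidental context
# (random other cells) fires regardless of which object is shown.
for trial in range(200):
    object_a = trial % 2 == 0
    active_cells = np.zeros(N_REGION1_CELLS, dtype=bool)
    if object_a:
        active_cells[:10] = True
    active_cells[rng.integers(10, N_REGION1_CELLS, size=10)] = True
    active_columns = np.zeros(N_REGION2_COLUMNS, dtype=bool)
    active_columns[0 if object_a else 1] = True
    learn(active_cells, active_columns)

print("strong synapses to column 0 from relevant cells:   ",
      int((permanence[0, :10] > CONNECTED).sum()), "of 10")
print("strong synapses to column 0 from coincidental cells:",
      int((permanence[0, 10:] > CONNECTED).sum()), "of", N_REGION1_CELLS - 10)
```

In this toy run the ten “relevant” cells keep strong synapses onto column 0, while the coincidentally active cells decay below the connection threshold - which is the weeding-out described above.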
The emergence of meaning
As information passes from region to region, it undergoes an alternation of pattern recognitions (performed by spatial poolers) and of contextualizations (performed by temporal and functional poolers).
Spatial poolers compress information. After all, they use a smaller number of bits (the number of columns) to represent information which was encoded with a large number of bits (the number of neurons in the receptive field).
Temporal and functional poolers expand information. After all, they use a large number of bits (the number of neurons in the region) to represent information which was encoded with a smaller number of bits (the number of columns in the same region).
As information passes from region to region, it undergoes an alternation of expansions and compressions. The expansions provide additional information which tentatively expands the meaning of the information being fed forward. The compressions take care of removing the additional information which was coincidental but not relevant. How exactly they do so is described in the next section. The information that gets retained - the information that forms patterns recognized by the next region - is meaning.
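As a purely numerical illustration (the sizes below are assumptions chosen only to make the point, not measurements), the same information occupies fewer bits after the spatial pooler and more bits after the temporal/functional poolers:

```python
# Toy sizes, chosen only to illustrate the compression/expansion argument.
receptive_field_neurons = 4096   # input bits seen by the spatial pooler
columns_in_region = 1024         # the spatial pooler compresses onto these
cells_per_column = 16            # temporal/functional poolers expand onto these

neurons_in_region = columns_in_region * cells_per_column  # 16384

# Compression: 4096 input bits -> 1024 column bits.
print("compression:", receptive_field_neurons, "->", columns_in_region, "bits")
# Expansion: 1024 column bits -> 16384 cell bits (each column's pattern is
# re-expressed by which of its cells fire, i.e. in which context it occurred).
print("expansion:  ", columns_in_region, "->", neurons_in_region, "bits")
```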
In other words, as information passes from region to region, an expansion followed by a compression allows for meaning to get added - to emerge.
How is relevance of context determined?
Before information enters a region, it is encoded by a large number of neurons, forming the receptive field. When it enters the region and gets processed by the spatial pooler, the information gets represented by a much smaller number of items: the columns. As fewer bits are available to encode the information, a compression takes place.
After the spatial pooler “decides” which columns to activate, the functional and temporal poolers “decide” which neurons in those columns fire. Because the number of bits increases (the number of neurons in a region is larger than the number of columns), an expansion takes place.
The dimension along which the compression takes place depends on the dimension(s) used by the previous region(s) to encode information. For example, information coming from regions receiving input from the retina is represented on visual dimension(s).
The dimensions along which the expansion takes place depend on the dimension(s) used by the regions that provide the context - the signals arriving on the apical dendrites.
From the paragraphs above, it emerges that a single region cannot represent the full process. A full iteration of information processing consists of a spatial pooler, followed by a functional pooler (and possibly a temporal one), followed by another spatial pooler. The key element is the second spatial pooler, whose importance lies in neglecting those neurons activated in the previous step which represent irrelevant information; in other words, in letting only the signal pass.
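Finally, a small sketch of the filtering itself, again with assumed names, sizes, and a hypothetical overlap threshold: once only the relevant context cells have strong synapses onto a column of the second region, that column is driven by those cells alone, and coincidentally active cells neither activate it nor prevent its activation:

```python
import numpy as np

N_REGION1_CELLS = 200
OVERLAP_THRESHOLD = 8     # connected active synapses needed to activate the column

# Assume learning (as sketched earlier) left column 0 with strong synapses only
# to the ten relevant context cells (indices 0-9).
strong_synapses = np.zeros(N_REGION1_CELLS, dtype=bool)
strong_synapses[:10] = True

def column_active(active_cells):
    """The column activates if enough of its strong synapses see active cells."""
    return int((strong_synapses & active_cells).sum()) >= OVERLAP_THRESHOLD

relevant_only = np.zeros(N_REGION1_CELLS, dtype=bool)
relevant_only[:10] = True

with_noise = relevant_only.copy()
with_noise[50:70] = True          # coincidentally active, irrelevant cells

noise_only = np.zeros(N_REGION1_CELLS, dtype=bool)
noise_only[50:70] = True

print(column_active(relevant_only))  # True  - relevant context is enough
print(column_active(with_noise))     # True  - irrelevant activity is ignored
print(column_active(noise_only))     # False - noise alone does not pass
```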