Here I want to discuss a problem whose main goal is to find a set of dendrites that optimally covers a dataset of high-dimensional SDRs.
Here “high” means something like tens of thousands of bits or more, instead of the hundreds or couple of thousands at which input patterns are typically sized in HTM.
Why? Because
- the search space in HTM-like tools is performance-capped by the size/complexity of the input.
- the eager strategy of tapping into the immediately available potential patterns within an SDR might not be optimal.
What does “optimal” mean? There are two types of metrics to consider: one at the individual dendrite level, the other at the global, population-of-dendrites level.
- a dendrite’s sharpness - a dendrite is “sharp” when all of its synapse inputs are either active or inactive, and “confused” when only some of its synapses are active. We could also call this “consistency”: the dendrite is tuned to a clear micro-pattern (or feature) within the large SDR.
- at the population level we have a completeness or coverage criterion - coverage is complete when every 1 bit in any given input SDR is covered by a signalling dendrite, i.e. “no bits left unaccounted for” in the input SDRs.
- also at the population level we have two minimality criteria:
  - at the dataset level, a minimum yet sufficient number of dendrites needed to cover any input SDR in the dataset,
  - minimal redundancy on any individual SDR, which again means as few active dendrites as possible.
As you can see, optimizing for all three criteria above can be a tricky problem; a minimal sketch of these metrics follows below.
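To make the criteria a bit more concrete, here is a minimal sketch in Python. It assumes an SDR is just a set of ON bit indices and a dendrite is the set of input bits its synapses attach to; the function names, the firing threshold and the exact formulas are my own illustrative choices, not taken from any particular HTM implementation.

```python
# Sketch only: SDRs are sets of ON bit indices, a dendrite is the set of
# input bits its synapses connect to. Thresholds/formulas are illustrative.

def sharpness(dendrite: set[int], sdr: set[int]) -> float:
    """How far the dendrite is from the 'confused' middle on this SDR.
    1.0 means all synapses active or all inactive; 0.5 means half and half."""
    active_fraction = len(dendrite & sdr) / len(dendrite)
    return max(active_fraction, 1.0 - active_fraction)

def fires(dendrite: set[int], sdr: set[int], threshold: float = 1.0) -> bool:
    """A dendrite 'signals' when (nearly) all of its synapses are active."""
    return len(dendrite & sdr) / len(dendrite) >= threshold

def coverage(dendrites: list[set[int]], sdr: set[int]) -> float:
    """Fraction of the SDR's ON bits touched by at least one firing dendrite."""
    covered = set()
    for d in dendrites:
        if fires(d, sdr):
            covered |= d & sdr
    return len(covered) / len(sdr)

def redundancy(dendrites: list[set[int]], sdr: set[int]) -> int:
    """Number of dendrites firing on this SDR - minimality wants this small."""
    return sum(fires(d, sdr) for d in dendrites)
```

In these terms, a well-tuned population would show sharpness near 1.0 for every dendrite, coverage of 1.0 on every SDR in the dataset, and a small redundancy count.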
This seems related to a tiling problem in which a dendrite is a tile (a toy greedy sketch follows after the list below). The differences are:
- we can pick any “shape” for a “tile”,
- there is no limit on the number of tiles, BUT we want to find the particular shapes that allow for a minimum number of tiles.
- there are many “spaces” to be covered - the ON bits of each input SDR form a different space.
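Here is the toy greedy sketch mentioned above, treating the problem as set cover: each “tile” is a small subset of ON bits, and we repeatedly pick whichever tile accounts for the most still-uncovered bits across the whole dataset. The function name, the fixed tile_size, and the brute-force candidate enumeration are assumptions of mine for illustration only; real dendrite growth would have to be incremental and far cheaper.

```python
from collections import Counter
from itertools import combinations

def greedy_dendrite_cover(dataset: list[set[int]], tile_size: int = 3,
                          max_dendrites: int = 100) -> list[frozenset[int]]:
    """Toy set cover: repeatedly add the tile_size bit subset ("tile") that
    occurs, still uncovered, in the most SDRs. Brute-force enumeration, so
    only usable on tiny toy SDRs - the point is the combinatorial structure,
    not efficiency."""
    uncovered = [set(sdr) for sdr in dataset]      # remaining ON bits per SDR
    dendrites: list[frozenset[int]] = []
    while any(uncovered) and len(dendrites) < max_dendrites:
        counts: Counter = Counter()
        for rest in uncovered:
            # every tile drawn from one SDR's ON bits is "sharp" on that SDR
            for combo in combinations(sorted(rest), tile_size):
                counts[frozenset(combo)] += 1
        if not counts:
            break   # leftover SDRs have fewer uncovered bits than tile_size
        best, _ = counts.most_common(1)[0]
        dendrites.append(best)
        for sdr, rest in zip(dataset, uncovered):
            if best <= sdr:        # the dendrite fires sharply on this SDR...
                rest -= best       # ...so its bits are now accounted for
    return dendrites
```

Greedy selection is the classic approximation for minimum set cover (which is NP-hard), which is why it makes a natural baseline for thinking about this problem, even though it is nothing like a biologically plausible learning rule.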
Further motivation:
Biological intelligence somehow manages not only to figure out a way to “recognize” a “large” pattern but also to home in on a minimum set of micro-patterns that are necessary and sufficient (== representative) for the larger pattern.
The highly praised “few-shot learning” animals are allegedly capable of might involve more than just quickly adding/removing some synapses, as in “here’s a paper and voilà, I understood (e.g.) Newtonian gravity”. It takes time to dig in and find the significant correlations needed to represent, then understand, any given problem or spatio-temporal context.
There are reasons to believe the high number of (mini)columns is needed not only to record and recognize all the patterns we encounter, but also for the large-scale data mining needed to discover a minimal set of relevant ones.