This assumption isn’t quite true.
There are two states for every pairing of input bit and column. First, each column has multiple input bits in its “pool”, meaning it has designated input bits to which it *can* become “connected” - but they are not yet connected. Second, every input bit in a column’s pool can have its synapse permanence incremented, whether or not its synapse is currently connected.
Of all the input bits whose permanence exceeds a “connected threshold”, only one of those connected bits is selected.
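To make the pool/connected distinction concrete, here’s a minimal sketch for a single column. All sizes and the 0.5 threshold are made-up illustration values, not NuPIC’s actual parameters:

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustration-only sizes and threshold -- not real NuPIC parameters.
n_inputs = 16              # size of the input space
pool_size = 8              # input bits in this one column's potential pool
connected_threshold = 0.5  # permanence above this => synapse is "connected"

# The column's pool: a random subset of input bits it *may* connect to.
pool = rng.choice(n_inputs, size=pool_size, replace=False)

# One permanence value per potential synapse, initialised near the threshold.
permanences = rng.uniform(0.3, 0.7, size=pool_size)

# Connected synapses are the subset whose permanence exceeds the threshold;
# the others remain potential-only, but can still be incremented later.
connected = pool[permanences > connected_threshold]
```

The point is that “in the pool” and “connected” are different sets: the second is always a subset of the first, and learning moves bits between them.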
Each column here is connected to one unique bit in the input space, so this specific SP is not going to be able to represent many different input values.
In this specific hypothetical case (remember that most often there will be multiple synapses whose permanence exceeds the “connected threshold”), synapses belonging to other bits can have their permanences incremented above the threshold. In fact, there is a boosting mechanism external to this “fairness” policy which, when employed (its idiosyncrasies are the subject of some debate at the moment), greatly increases the likelihood that synapses which aren’t currently connected become connected.
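For illustration, here is a toy version of the kind of duty-cycle boosting being described. The exponential form and every number in it are assumptions for this sketch, not the exact mechanism under debate:

```python
import numpy as np

# Toy duty-cycle boosting (numbers and the exponential form are assumptions
# for this sketch): columns that have been active less often than the target
# get their overlap scores multiplied upward, so their not-yet-connected
# synapses get more chances to learn and cross the threshold.
boost_strength = 2.0
target_duty_cycle = 0.02                                  # desired activity rate
active_duty_cycles = np.array([0.05, 0.02, 0.001, 0.0])   # per-column history

boost_factors = np.exp(boost_strength * (target_duty_cycle - active_duty_cycles))

overlaps = np.array([10.0, 10.0, 10.0, 10.0])  # equal raw overlaps
boosted = overlaps * boost_factors             # starved columns now win
```

With equal raw overlaps, the columns that have rarely been active end up with the highest boosted scores, which is exactly the “fairness” pressure described above.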
If this SP is trained until every potential synapse has formed a connection with its associated input bit, the SP is not going to be able to distinguish between different inputs.
The SP is an “online” learning algorithm, which means (due to the decrementing of connected synapses whose input bits are “zero”) that it will, over time, adapt to an entirely different space of input bits, potentially from different problem domains.
Now… when encountering a totally different space of inputs, the SP will need cycles in order to “adapt”, and the columns initially selected to represent the new input won’t necessarily be correct for some period after the input domain changes. By “correct”, I mean that the SP’s job is to have similar columns represent similar inputs, so that the resulting SDRs contain columnar selections proportional to the overall similarity of the input - relative to the similarity that the members of that set of inputs have with each other. See Raul’s explanation.
Of course, this number can be increased considerably if we connect the 2048 columns to a larger input space, allow overlapping potential synapses, and take multiple potential synapses connecting into this input space.
I was wondering by how much this number will increase. Hence the question: when is the spatial pooler full?
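A crude upper bound on “this number” is just the count of distinct 2%-sparse column sets - a quick sketch:

```python
from math import comb

n_columns = 2048
n_active = 40  # ~2% of 2048 columns active

# Loose upper bound on distinguishable inputs: the number of distinct sets
# of 40 active columns out of 2048. The practical capacity is far lower,
# since similar inputs must map to overlapping (not distinct) SDRs.
capacity = comb(n_columns, n_active)
```

That combinatorial bound is astronomical (well over 10^80), so “full” in practice is about the semantics of the mapping, not about running out of distinct column sets.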
Remember there are a number of different “criteria” used to decide which “active” columns will represent a given input:
- Each column has synapses to a random subset of 50% of the input bits (its “pool”).
- The same input bits are very likely to appear in several columns’ input pools.
- A synapse’s permanence must be above a connection threshold in order for its input bit to become “eligible” for the next selection process.
- Only 2% of the columns are selected to become active. These can be either the most strongly connected (highest-overlap) columns globally, or the highest within a given inhibition radius.
- All synapses that are connected but whose input bits are zero are decremented. All synapses whose input bits are “1” are incremented.
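Putting those criteria together, a minimal toy SP step might look like the following. All parameter values are invented, and note that I apply the increment/decrement only within the winning columns’ pools, which is how the learning rule is usually stated:

```python
import numpy as np

rng = np.random.default_rng(0)

# All parameter values here are invented for the sketch.
n_inputs, n_columns = 100, 64
pool_fraction, threshold = 0.5, 0.5    # 50% pools, connection threshold
perm_inc, perm_dec = 0.05, 0.03        # permanence increment / decrement
k = max(1, int(0.02 * n_columns))      # ~2% of columns become active

# Each column's pool is a random ~50% of the input bits; permanences are
# zero outside the pool and start near the threshold inside it.
pools = rng.random((n_columns, n_inputs)) < pool_fraction
perms = rng.uniform(0.3, 0.7, size=(n_columns, n_inputs)) * pools

def sp_step(input_bits):
    connected = perms > threshold                     # connection threshold
    overlaps = (connected & (input_bits == 1)).sum(axis=1)
    active = np.argsort(overlaps)[-k:]                # global top-k inhibition
    for c in active:                                  # learn on winners only
        perms[c, (input_bits == 1) & pools[c]] += perm_inc
        perms[c, (input_bits == 0) & pools[c]] -= perm_dec
    np.clip(perms, 0.0, 1.0, out=perms)
    return active

x = (rng.random(n_inputs) < 0.1).astype(int)  # a sparse toy input
active = sp_step(x)
```

This only does global inhibition; swapping in a local inhibition radius would change how `active` is chosen, not the learning rule.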
Then there is the eventual inclusion of @fergalbyrne 's paCLA idea, which is to further reward or punish those SP/columnar synapses which arise from predicted or unpredicted inputs, allowing the TM to influence the spatial distribution. This enhancement to the theory is deemed significant enough to consider as an addition to the algorithm, but it has not been thoroughly examined by Numenta to date and has only been partially implemented in comportex and htm.java.
So, again the question: when is the spatial pooler full?
I suppose it is theoretically possible for the SP’s compression of the entire input domain to become “saturated”, meaning that if it encounters an input set which varies widely enough, the resulting SDRs could eventually (erroneously) share the same columns due to the lossy compression?
I think the robustness reported in the paper addresses this possibility too? I’m not sure, maybe not. I would like this confirmed as well.
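As a baseline for the saturation worry: before any saturation, two unrelated random 40-of-2048 SDRs barely overlap at all, so wholesale column sharing really would signal lossy-compression trouble. A quick hypergeometric sketch (pure combinatorics, not an SP simulation):

```python
from math import comb

n, k = 2048, 40

# Expected number of shared active columns between two independent random
# 40-of-2048 SDRs: k*k/n, i.e. less than one column on average.
expected_overlap = k * k / n

def p_overlap_at_least(b):
    """P(two random k-of-n SDRs share >= b active columns) -- hypergeometric."""
    total = comb(n, k)
    return sum(comb(k, o) * comb(n - k, k - o) for o in range(b, k + 1)) / total
```

By chance, sharing even half the active columns is vanishingly improbable, so substantial shared columns between dissimilar inputs would have to come from the learned (saturated) mapping rather than from chance.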