In HTM School and the paper on properties of SDRs, the robustness of SDRs is demonstrated with respect to flipping some percentage of the active bits. Why is this the relevant metric? Is there a neurological explanation? Why not choose, instead of the active bits, some percentage of all bits, or even a fixed number of bits?
Pardon, I can't fully understand your question. Could you explain it a bit more?
When I added noise to SDRs in HTM School, I did not add it only to the active bits; I added noise to the entire representation.
I understand this. My point is that if we calculate the noise with respect to the active bits, this automatically favors sparse representations. For example, if we have 2048 bits in total and a sparse representation with 40 active bits, then 30% noise means we change the state of 12 bits. But if we had 1000 active bits, it would mean changing 300 bits. It doesn't make much sense to me to compare these two.
Since the activity is sparse, choosing 30% of all bits for noise essentially means activating ~30% of all the neurons (since most are inactive). I am not a neuroscientist myself, but as I understand the biology, there are inhibitory systems in place which enforce sparsity, so in your example 30% of all neurons would not physically be able to activate at once. Since the system enforces some maximum activation density, I believe it makes more sense to define the percentages the way Numenta has when talking about noise tolerance.
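To make the arithmetic in this exchange concrete, here is a small Python sketch using the sizes mentioned above (2048 total bits, 40 active, 30% noise). It flips 30% of the *active* bits (turns them off and activates the same number of previously inactive bits) and shows how large the overlap with the original SDR remains; the 90% match threshold and the helper names are just illustrative choices, not anything from the thread:

```python
import random

N_BITS, N_ACTIVE, NOISE = 2048, 40, 0.30

# A random SDR: 40 active bit positions out of 2048.
original = set(random.sample(range(N_BITS), N_ACTIVE))

# Noise measured against the ACTIVE bits: flip 30% of the 40 active bits.
n_flip = int(NOISE * N_ACTIVE)  # 12 bits change state
dropped = set(random.sample(sorted(original), n_flip))
added = set(random.sample(sorted(set(range(N_BITS)) - original), n_flip))
noisy = (original - dropped) | added

overlap = len(original & noisy)
print(f"flipped {n_flip} active bits, overlap = {overlap}/{N_ACTIVE}")
# -> flipped 12 active bits, overlap = 28/40

# Noise measured against ALL bits would instead flip
# int(NOISE * N_BITS) = 614 bits, most of them inactive -> active,
# which the brain's inhibitory circuits would never permit at once.
print(int(NOISE * N_BITS))  # -> 614
```

Even after losing 12 of 40 bits, the 28-bit overlap is enormous compared to the ~0.8 bits two unrelated 2%-sparse SDRs would share by chance, which is the sense in which this noise definition demonstrates robustness.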
If you think of the biology, an SDR is the set of synapses (connected bits) formed as a dendrite snakes along through the rising projecting axons in layer one (the input field). The entire length of the dendrite provides the 2K or so potential connections for that SDR; only a sparse set of the dendrite's synapses form connections to the rising axons.
These synapses (connections) to the axons (the input field) may be present or missing, and a projecting axon may be signalling a positive or negative representation of an input feature. (For example, a firing axon could be signalling either a light OR a dark feature at this location.)
There can also be multiple possible dendrite/synapse connections to each rising axon.
When you think of “active” connections in a sparse representation, the absence of a connection is actually part of the data representation. No connection was learned there because at that synapse location there is no learned relationship; the cells did not “fire together and wire together.” That rising axon/synapse location may be signalling something that other SDRs have added to their collection of active bits.
Considering all this, the noise/error could be anywhere along the dendrite, and it could be either a missing or an excess synapse connection.
Thank you very much for your responses. Sorry I couldn’t get to them earlier. I confess I will have to do some homework in biology to fully understand @Bitking’s answer, but from what I gather, robustness is defined this way for biological (rather than purely mathematical) reasons.
The reason I was asking is that I wanted to understand why the spatial pooler is useful in a computer implementation (which does not suffer from the same noise issues as the brain, so we do not care as much about robustness there).
As I imagine it now, the most useful part is the ability to group multiple representations together into one (using a simple union) and then check whether any given original is present in that union (which is obviously not possible with dense representations). That gives us the ability to predict multiple outcomes simultaneously. Does this viewpoint make sense? Do you have an alternative?
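This union property can be sketched in a few lines of Python. The sizes (2048 bits, 40 active), the 90% match threshold, and the `matches` helper are illustrative assumptions, not part of any HTM library API; the point is only that sparsity keeps false matches against the union astronomically unlikely:

```python
import random

N_BITS, N_ACTIVE = 2048, 40
random.seed(42)

def make_sdr():
    """A random SDR: 40 active bit positions out of 2048 (~2% sparsity)."""
    return frozenset(random.sample(range(N_BITS), N_ACTIVE))

# Three possible next inputs, all predicted at once via a single union.
predictions = [make_sdr() for _ in range(3)]
union = set().union(*predictions)

def matches(sdr, union_set, threshold=0.9):
    """An SDR counts as 'present' if nearly all of its bits are in the union."""
    return len(sdr & union_set) >= threshold * len(sdr)

# Every stored SDR is recognized in the union...
print(all(matches(p, union) for p in predictions))  # -> True

# ...while a fresh random SDR almost certainly is not: its expected
# overlap with the ~120-bit union is only about 40 * 120/2048 ≈ 2.3 bits,
# nowhere near the 36-bit threshold.
novel = make_sdr()
print(matches(novel, union))  # -> False
```

With a dense representation (say 50% of bits active), the union of even two patterns would cover most of the space and nearly any input would "match", which is why this trick depends on sparsity.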
Computers might not have analog noise, but they can still get the wrong answers. Changing a fraction of the active bits simulates partially wrong data.