The percentage of active columns in an HTM model is usually said to be 2% for biological reasons… but how sacred is that number from a mathematical point of view?

What are the likely consequences of raising it to 5% or 10%?

# What happens if I increase the percentage of active columns?

**ali_m**#1

I would say activations would overlap more and noise robustness would decrease. It would also reduce representational capacity if hierarchy / temporal pooling is involved (unions of activations). On the other hand, it would probably yield faster initial learning. 2% is just a good default value, and HTM would not break if you set it to 5% or 10% for most tasks. Learning on rich datasets would suffer, though: for example, the system would produce more false positives as separate activations start to overlap.
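The overlap effect above is easy to check empirically. Below is a minimal sketch (not Numenta's implementation) that samples pairs of purely random SDRs at different activation percentages and measures how many active bits they share; the column count of 2048 is an assumption, chosen because it is a common HTM default.

```python
import random

def random_sdr(n, pct, rng):
    """Sample a random SDR as a set of active column indices."""
    k = max(1, int(n * pct))
    return set(rng.sample(range(n), k))

def mean_overlap(n, pct, pairs=2000, seed=0):
    """Average number of shared active bits between random SDR pairs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(pairs):
        total += len(random_sdr(n, pct, rng) & random_sdr(n, pct, rng))
    return total / pairs

n = 2048  # column count; 2048 is a common HTM default, assumed here
for pct in (0.02, 0.05, 0.10):
    k = int(n * pct)
    # For two independent random SDRs, the expected overlap is k*k/n,
    # so raising the active percentage grows chance overlap quadratically.
    print(f"{pct:4.0%} active: k={k:4d}, "
          f"mean overlap {mean_overlap(n, pct):6.1f} (expected {k * k / n:5.1f})")
```

Going from 2% to 10% active columns multiplies the expected chance overlap by 25, which is the mechanism behind the extra false positives described above.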

**ali_m**#4

Thank you…

And is there a correlation between SDR size and the optimal activation percentage?

For example, if I'm using a smaller SDR, would the optimal activation percentage be higher or lower?

**jakebruce**#5

There’s an effect on generalization here. More sparsity implies fewer false positives, but that always comes at the cost of more false negatives.

**subutai**#6

Yes, there is a tradeoff between sparsity and the size of the vector. If you have very high-dimensional vectors, you can tolerate a pretty wide range of sparsities; it doesn’t matter too much. With lower dimensions, the range is much tighter, and with really low dimensions you might be in trouble.

The equations governing this are given here (see, for example, Fig. 4A or Fig. 5):
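As an illustrative sketch of this kind of analysis (not the referenced figures themselves): for random SDRs, the probability that an unrelated vector matches a stored one at a given overlap threshold is a hypergeometric tail, which can be evaluated exactly. The dimensions and the half-overlap threshold below are assumptions chosen to show the effect subutai describes.

```python
from math import comb

def false_match_prob(n, w, theta):
    """Probability that a random SDR with w active bits out of n overlaps a
    fixed w-bit SDR in at least theta positions (hypergeometric tail)."""
    return sum(comb(w, b) * comb(n - w, w - b)
               for b in range(theta, w + 1)) / comb(n, w)

# Same 10% sparsity everywhere, matching threshold at half the active bits:
# high dimensions drive the false-match probability toward zero, while
# low dimensions leave it uncomfortably large.
for n in (64, 256, 2048):
    w = n // 10
    theta = max(1, w // 2)
    print(f"n={n:5d}, w={w:3d}, theta={theta:3d}: "
          f"P(false match) = {false_match_prob(n, w, theta):.3e}")
```

Holding the sparsity fixed and growing the dimension shrinks the false-match probability by many orders of magnitude, which is why high-dimensional vectors tolerate a wide range of sparsities while low-dimensional ones do not.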