In the paper "How do neurons operate on sparse distributed representations? A mathematical theory of sparsity, neurons and active dendrites," figures 5A & B and 6 are directly related to the question of SDR size and the reliability of the representation. Both the size of the SDR and the number of bits sampled are probed, along with their relationship to accuracy in the positive and false-positive cases.
The accompanying text essentially shows that performance improves more from increasing the SDR length than from increasing the number of message bits in that SDR. A sweet spot is described where the error performance was better than one in 10^9. The discussion explains that there is a point of diminishing returns, beyond which adding more bits does not add very much to the performance.
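To make that concrete, the false-match probability in those papers is the chance that a random SDR of n bits with w active bits overlaps a stored SDR in at least θ bits: fp = Σ_{b=θ}^{w} C(w,b)·C(n−w,w−b) / C(n,w). Here is a minimal sketch of that calculation; the sparsity and threshold values below are my own picks for illustration, not the paper's exact figure settings:

```python
from math import comb

def false_match_probability(n, w, theta):
    """Chance that a random SDR (n bits, w active) overlaps a fixed
    SDR (same n, w) in at least theta bits -- the false-positive
    formula from the Ahmad & Hawkins SDR papers."""
    matches = sum(comb(w, b) * comb(n - w, w - b) for b in range(theta, w + 1))
    return matches / comb(n, w)

# Growing n drives the error down far faster than tweaking w does:
for n in (64, 256, 1024, 2048):
    w = max(2, n // 50)       # hold sparsity near 2%
    theta = max(1, w // 2)    # match on half the active bits
    print(f"n={n:5d} w={w:3d} theta={theta:3d} "
          f"fp={false_match_probability(n, w, theta):.3e}")
```

Running the sweep shows the false-positive rate collapsing by many orders of magnitude as n grows, which is exactly the trend the figures describe.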
Executive summary: very small SDRs will have worse performance, and adding more “on” bits to such a small SDR does not help.
That is the underlying math behind the selection of SDR size. There are also practical concerns with implementation. As Martin pointed out, the models that are practical to run are rather limited in scope compared to the number of columns needed to exhibit the theoretical behaviors observed in the wetware. Various band-aids are employed, such as “boosting” and “fully connecting” the arrays, to get the models to do anything.
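For reference, the boosting band-aid works roughly like this: each column tracks how often it wins, and columns that fire less often than the target density get their overlap scores scaled up. Below is a sketch of the exponential update rule used in NuPIC-style spatial poolers, as I understand it; the parameter values are illustrative:

```python
import numpy as np

def update_boost_factors(active_duty_cycles, target_density, boost_strength):
    """NuPIC-style exponential boosting (my paraphrase of the rule):
    under-active columns get factors > 1, over-active columns < 1."""
    return np.exp((target_density - active_duty_cycles) * boost_strength)

duty = np.array([0.00, 0.01, 0.02, 0.10])  # fraction of rounds each column won
print(update_boost_factors(duty, target_density=0.02, boost_strength=3.0))
# -> roughly [1.06, 1.03, 1.00, 0.79]: starved columns get a leg up
```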
None of the models I have looked at are large enough to demonstrate topological behavior. This is a key in-vivo behavior documented by researchers such as Moser. The topology switch in the models just controls how the cells are interconnected locally. All the models I have looked at have to have topology turned off to work at all: they are fully connected, with all cells sampling all other cells in the model.
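To illustrate what that switch amounts to (a toy sketch of my own, not actual model code): with topology on, a column's potential pool is limited to inputs near its own position; with it off, the pool is the whole input space.

```python
import numpy as np

def potential_pool(center, n_inputs, radius, topology_on):
    """Toy version of the topology switch: local wiring when on,
    full connectivity (every cell samples every input) when off."""
    if not topology_on:
        return np.arange(n_inputs)  # fully connected
    lo, hi = max(0, center - radius), min(n_inputs, center + radius + 1)
    return np.arange(lo, hi)        # local neighborhood only

print(potential_pool(center=50, n_inputs=100, radius=4, topology_on=True))        # 9 local inputs
print(potential_pool(center=50, n_inputs=100, radius=4, topology_on=False).size)  # all 100
```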
Without being able to duplicate and manipulate what the brain does, there is limited capability to test theories about how the wetware is doing what it does. When Jeff says that we really don’t know how this stuff works, he is correct. I have read many papers with subtle variations on how the cells interact in the cortical column (CC), and even tiny changes lead to major shifts in the proposed mechanisms. There is some convergence on possible mechanisms, driven by better in-vivo probing and recording technology, but having a high-fidelity simulation would be very helpful in sorting out what is going on.
There is a threshold below which the networks don’t do very much at all; I don’t know what that threshold is. Small (toy?) networks do “something,” but I don’t know if they actually convey any useful understanding of what the brain is doing or how it does it. The hot gym model shows a type of learning, but it is a very far bridge from there to a deeper understanding of how that builds into things like the formation of Gabor filters in the V1 region.
Without the interactions that come from larger models, the investigation of the CC by itself is useless: the large-scale interactions with other cells are missing, and you have no way of knowing if it is working. We can tweak how the CC works in these tiny models, but that may end up being exactly the wrong thing when the model is increased in size. We just don’t know, because we can’t test the ideas.
I expect that as the technology to do the simulations gets better, the band-aids won’t be needed and the emulation of the wetware will get good enough that we start to see better agreement with what has been observed in the wetware. We seem to be very far from that now.