The SDR capacity formula is just the combination count: C(n, w) = n! / (w! (n − w)!) for n bits with w active. But that holds only if we consider two SDRs different even when they differ by merely one bit.
In practice we consider two SDRs different only if, say, 20% of their bits differ.
What is the capacity formula for that condition?
One approach: count all possible >80% matches for any random SDR (its neighborhood), then divide the number of unique possible SDRs by that count. (Or half of it, maybe?)
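Here is a minimal sketch of that estimate in Python. The helper names are mine, and I'm assuming "difference" means the number of active bits moved, so a ≤20% difference for w = 20 means up to 4 active bits moved:

```python
from math import comb

def neighborhood_size(n, w, max_diff_bits):
    """Count SDRs (n bits, w active) whose overlap with a fixed SDR is
    at least w - max_diff_bits: moving k active bits means choosing
    k of the w active bits to clear and k of the n - w inactive bits
    to set, summed over k = 0 .. max_diff_bits."""
    return sum(comb(w, k) * comb(n - w, k) for k in range(max_diff_bits + 1))

def capacity_estimate(n, w, diff_fraction=0.2):
    """Rough capacity: total number of unique SDRs divided by the size
    of one >=(1 - diff_fraction)-overlap neighborhood. Neighborhoods of
    random SDRs actually overlap each other, so this is only a ballpark
    figure, not an exact sphere-packing count."""
    max_diff_bits = int(diff_fraction * w)   # 4 bits for n/w = 1000/20
    return comb(n, w) // neighborhood_size(n, w, max_diff_bits)

print(f"{float(capacity_estimate(1000, 20)):.1e}")   # ~1.8e+27
```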
PS: this gives only an approximate count of non-overlapping neighborhoods. I don't know whether such a number, like the absolute number of possible SDRs for a given size/sparsity, has any meaning beyond a rough view of the power of representation; you might call it "encoding bandwidth".
Metrics could be interesting though. For a base SDR of size/length = 1000/20, the number of neighbors at exactly k bits of distance is C(20, k) × C(980, k). A one-bit distance (moving one of the starting 20 active bits) gives 20 × 980 = 19,600 neighbors; a two-bit distance gives about 91 million (naively squaring 19,600 gives ~3.8 × 10^8, which overcounts by a factor of about four because the moved bits are unordered); and a four-bit distance (80% overlap) is on the order of 2 × 10^14.
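These counts are easy to check exactly; a short sketch (the helper name is mine):

```python
from math import comb

def neighbors_at_distance(n, w, k):
    """SDRs reachable from a fixed (n, w) SDR by moving exactly k active
    bits: choose k of the w active bits to clear and k of the n - w
    inactive bits to set."""
    return comb(w, k) * comb(n - w, k)

for k in (1, 2, 3, 4):
    print(k, f"{float(neighbors_at_distance(1000, 20, k)):.3e}")
# 1 1.960e+04
# 2 9.114e+07
# 3 1.783e+11
# 4 1.851e+14
```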
Even though the absolute neighborhood sizes quickly become super-astronomical as the bit distance increases, in the real world this often isn't enough for the underlying learning mechanism, since the number of parameters/synapses (compute and memory capacity) in a network is limited by physics, biology, or technology.
How the ratio between neighborhood size and the number of neighborhoods changes as density increases or decreases might provide other interesting (though likely not very useful) information about how many "things" an SDR can or cannot encode; a quick density sweep is sketched below.
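A sketch of that sweep at a fixed n = 1000, carrying over the 20%-difference threshold from above as an assumption:

```python
from math import comb

def sweep(n, w, diff_fraction=0.2):
    """Neighborhood size and estimated number of distinguishable codes
    for one (n, w) choice, using the same rough divide-by-neighborhood
    capacity estimate as above."""
    k_max = int(diff_fraction * w)
    hood = sum(comb(w, k) * comb(n - w, k) for k in range(k_max + 1))
    return hood, comb(n, w) // hood

for w in (10, 20, 40, 80):   # sparsities 1% .. 8% at n = 1000
    hood, est = sweep(1000, w)
    print(f"w={w:2d}  neighborhood={float(hood):.1e}  est. codes={float(est):.1e}")
```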
Yes, see the paper:
Subutai Ahmad, Jeff Hawkins (2016). "How do neurons operate on sparse distributed representations? A mathematical theory of sparsity, neurons and active dendrites." https://doi.org/10.48550/arXiv.1601.00720
I hope this helps!