Variable-sparsity SDRs and similarity?

The brain does not run on fixed sparsity; fixed sparsity is just a good rule of thumb for implementation.
How would you interpret OVERLAP in the context of VARIABLE sparsity?

For example, if you have vectors (a, b, c) with sparsities (5%, 2%, 2%) and overlaps:

         a/b = 20
         b/c = 20

which one is closer to b: a or c?

In general, how do you interpret different combinations of sparsity and overlap?
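To make the example concrete, here is a rough sketch in Python (the vector size n = 2000 is a made-up value, since the post only gives percentages):

```python
# Made-up concrete version of the a/b/c example above.
n = 2000
size_a = round(0.05 * n)   # 100 active bits at 5% sparsity
size_b = round(0.02 * n)   # 40 active bits at 2% sparsity
size_c = round(0.02 * n)   # 40 active bits at 2% sparsity

overlap_ab = 20
overlap_bc = 20

# Raw overlap says a and c are equally close to b (20 bits each),
# but those 20 bits are half of b's and c's bits, and only a fifth of a's.
print(overlap_ab / size_b, overlap_ab / size_a)   # 0.5 vs 0.2
print(overlap_bc / size_b, overlap_bc / size_c)   # 0.5 vs 0.5
```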

Random high-dimensional vectors are pretty much orthogonal most of the time, i.e. their dot product is very close to zero.
If the vectors are also sparse, there is virtually no chance they are anything but (basically) orthogonal.
This fact also translates to random sparse binary vectors, given the high dimensionality and extreme sparsity.
Even if the sparsity varies a little bit, the probability they overlap in a meaningful way is astronomically low.
Now, learned representations are not random. But if they overlap heavily, that means they have very high semantic similarity and they probably would have overlapped even without the varying sparsity.
So I don’t see how it makes any significant difference, and I don’t think it matters at all in practice.
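A rough numeric sketch of that point (the dimensionality and bit counts are made-up, illustrative values): the expected overlap of two independent random sparse vectors is roughly w_a * w_b / n, which stays tiny even when the sparsities differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2048                # illustrative dimensionality
w_a, w_b = 102, 41      # ~5% and ~2% sparsity (illustrative)

# Expected overlap of two independent random sparse vectors: w_a * w_b / n
print("expected overlap:", w_a * w_b / n)   # ~2.04 bits

# Monte Carlo check of the same quantity
overlaps = [
    len(np.intersect1d(rng.choice(n, size=w_a, replace=False),
                       rng.choice(n, size=w_b, replace=False)))
    for _ in range(1000)
]
print("mean simulated overlap:", np.mean(overlaps))
```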


I think the primary purpose of fixed sparsity is to prevent the network from giving more weight to any particular feature. By allowing the SDR of one feature to have more active bits, you are effectively giving it more chances to activate other neurons in the network.

Now sparsity is not the same thing as active bit count. For instance, you could have variable sparsity and fixed bit count by adjusting the sizes of the vectors. In that case, the variable sparsity will only affect the probability of random vectors overlapping, as noted by the previous poster.
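A tiny sketch of that distinction, with made-up numbers: the active-bit count stays fixed while the vector size (and therefore the sparsity) varies, so each feature gets the same number of chances to activate downstream neurons, and only the expected accidental overlap between random vectors changes.

```python
# Fixed active-bit count, variable sparsity (illustrative numbers).
w = 40                  # active bits, the same for every SDR
for n in (1000, 2000, 4000):
    print(f"n={n}: sparsity={w / n:.1%}, "
          f"expected random overlap={w * w / n:.2f} bits")
```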


You are probably right… I’m thinking that in such cases maybe Jaccard distance would be better if the sparsity difference is bigger. Just a thought.

Or better yet, the overlap coefficient: Similarity in Graphs: Jaccard Versus the Overlap Coefficient | NVIDIA Developer Blog
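For what it’s worth, here is a rough sketch of both metrics applied to the a/b/c example from the top of the thread (same assumed sizes as before: |a| = 100, |b| = |c| = 40, overlaps of 20 bits):

```python
def jaccard(inter, size_x, size_y):
    # |X ∩ Y| / |X ∪ Y|
    return inter / (size_x + size_y - inter)

def overlap_coef(inter, size_x, size_y):
    # |X ∩ Y| / min(|X|, |Y|)
    return inter / min(size_x, size_y)

size_a, size_b, size_c = 100, 40, 40
print(jaccard(20, size_a, size_b), jaccard(20, size_b, size_c))            # ~0.17 vs ~0.33: c is closer to b
print(overlap_coef(20, size_a, size_b), overlap_coef(20, size_b, size_c))  # 0.5 vs 0.5: a and c tie
```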


Jaccard distance is definitely a better similarity metric in general cases.
But wouldn’t it break the union properties?


Hmm, interesting… maybe the opposite, because Jaccard / the overlap coefficient will more fully represent the merge of two vectors…
By the way, with Jaccard / the overlap coefficient you can compare the union vector a|b with c; with plain overlap you are not really comparing but checking for existence, right? Which is subtly semantically different.

 J(a|b,c) vs olap(a|b,c) !!!

I have to think about it! What can you do with one that you can’t do with the other?
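Here is a rough sketch of what I mean, with made-up bit sets: plain overlap only counts how many of c’s bits exist in the union a|b, while Jaccard also penalizes all the extra bits the union carries.

```python
# Made-up bit positions, just to compare the two operations on a union.
a = set(range(0, 40))
b = set(range(30, 70))
c = set(range(20, 60))

union_ab = a | b
raw_overlap = len(union_ab & c)               # how many of c's bits exist in a|b
jacc = len(union_ab & c) / len(union_ab | c)  # also penalizes the extra bits in a|b

print(len(union_ab), raw_overlap, round(jacc, 3))   # 70, 40, 0.571
```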


The Jaccard index might hurt the union/sub-sampling properties that HTM relies on regularly.
Even if the input SDR has every bit active that the reference SDR has, the similarity score still decreases just because the input also has many other bits active, which might hurt the union operation of the TM/GCM and the sub-sampling operation of dendrite segments.
This overlap coefficient, on the other hand, might not hurt the properties that HTM utilizes.
But I still don’t understand why one would prefer this over the simple bit-AND-and-count operation. The reference SDR would pretty much always have hardly varying sparsity and far fewer active bits than the input SDR, so the denominator of the overlap coefficient would barely matter, making the operation virtually equal to bit-AND-and-count (the cardinality of the intersection).
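A quick sketch of that point (all sizes made up): when a reference SDR is fully contained in a growing union, the raw overlap and the overlap coefficient stay maximal, while the Jaccard score keeps dropping as the union picks up more bits.

```python
import random

random.seed(0)
n, w = 2048, 40                                  # illustrative SDR size / active-bit count
reference = set(random.sample(range(n), w))

union = set(reference)                           # the union always contains the reference
for _ in range(5):
    union |= set(random.sample(range(n), w))     # union in another random SDR
    inter = len(reference & union)
    jacc = inter / len(reference | union)
    ocoef = inter / min(len(reference), len(union))
    print(f"|union|={len(union)}: overlap={inter}, "
          f"Jaccard={jacc:.2f}, overlap-coef={ocoef:.2f}")
```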


I don’t prefer overlap over Jaccard; I’m just wondering whether there could be some advantage.

I’m thinking of a different task, more along the lines of distinguishing/organizing many (millions of) vectors with variable sparsity, say from 20/100_000 to 200/100_000 (not exactly SDRs, maybe not brain related :wink: ).

With pure overlap I think it will be hard to find clumps of vectors… with Jaccard you use all the bits! So you could, say, do a targeted merge to compress/unionize selectively.

As we know, every bit has semantic meaning, and overlap only accounts for similarities. But if, say, A has the same overlap with B and C, and B has more bits than C, it would seem that A should be more similar to C, because it has fewer missing features!

Something like long-term memory, where such variability is possible!
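A rough numeric sketch of that intuition (all sizes made up): with the same raw overlap of 20 bits, Jaccard ranks A closer to the smaller vector C than to the larger vector B.

```python
# Illustrative numbers only: A overlaps B and C by the same 20 bits,
# but B carries many more active bits than C.
size_a, size_b, size_c, overlap = 40, 200, 40, 20

jaccard_ab = overlap / (size_a + size_b - overlap)   # 20/220 ≈ 0.09
jaccard_ac = overlap / (size_a + size_c - overlap)   # 20/60  ≈ 0.33
print(jaccard_ab, jaccard_ac)                        # raw overlap alone would call them a tie
```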