What false positive rate marks the threshold between good and bad?

Papers usually report error rates such as the false positive rate (for example, when you check a random SDR y against a union u, there is a chance of a false positive match). My question is: where is the threshold at which the error rate becomes bad? At 10^-5, 10^-4, 10^-3, or 10^-2?
In other words, below what value is a false positive rate considered good?


For instance, consider SDR parameters n = 1024 and w = 2. Storing M = 20 vectors, the chance of a false positive is about one in 680. However, if w is increased to 20, the chance drops dramatically to about one in 5.5 billion. This is a remarkable feature of the union property. In fact, if increasing M to 40, the chance of an error is still better than 10^-5.
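
For anyone who wants to check the quoted numbers, here is a minimal Python sketch. It assumes the usual union approximation p = (1 - (1 - w/n)^M)^w, i.e. each of the w bits of a random query independently lands on a bit that is already ON in the union. The function name `union_false_positive` is made up for illustration and is not from any library.

```python
def union_false_positive(n: int, w: int, m: int) -> float:
    """Approximate chance that a random SDR matches a union of m stored SDRs.

    Assumes random SDRs with w ON bits out of n and treats bits as independent.
    """
    # Probability that any given bit is ON in the union of m random SDRs.
    p_bit_on = 1.0 - (1.0 - w / n) ** m
    # A false positive needs all w ON bits of the query to hit ON bits of the union.
    return p_bit_on ** w


if __name__ == "__main__":
    for w, m in [(2, 20), (20, 20), (20, 40)]:
        p = union_false_positive(1024, w, m)
        print(f"n=1024 w={w} M={m}: p={p:.3g} (about 1 in {1 / p:,.0f})")
```

Running it with the three parameter sets from the quote should land close to the "one in 680", "one in 5.5 billion" and "better than 10^-5" figures.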

The answer may depend on the exact problem you want to solve. We could also use guidance from neuroscience and ask: "what is the false positive rate given the sparsity observed in the brain and the number of active synapses required to detect a pattern?" In Figure 6 of @subutai's SDR paper, the boundary is drawn below 10^-8, corresponding to a spike threshold of 9 synapses, which is typical for dendritic spikes. That should be good enough for a wide range of applications.

The one thing I would add to @ycui's comment is that thresholds of 9 and up work well, but this also assumes that the underlying SDR dimensionality is high (e.g. 2048) and that the number of bits ON at a time is greater than 20 or so. We typically use a minimum of 40 bits ON out of 2048, and to be safer we use thresholds around 15. There is a very close correspondence between these numbers and biology.
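
To get a feel for how the threshold affects the error rate, here is a rough Python sketch. It is an assumption on my part about the setup, not necessarily the exact one behind Figure 6: a stored SDR and a random query SDR both have w ON bits out of n, and a false positive means they overlap in at least theta bits. The count of overlapping patterns is the standard hypergeometric one, and the name `match_false_positive` is hypothetical.

```python
from math import comb


def match_false_positive(n: int, w: int, theta: int) -> float:
    """Chance that a random SDR (w ON bits out of n) overlaps a fixed SDR
    (also w ON bits) in at least theta positions."""
    total = comb(n, w)  # number of possible random query SDRs
    # Queries sharing exactly b ON bits with the stored SDR:
    # choose b of its w ON bits, and the other w - b from the n - w OFF bits.
    hits = sum(comb(w, b) * comb(n - w, w - b) for b in range(theta, w + 1))
    return hits / total


if __name__ == "__main__":
    for theta in (9, 15):
        p = match_false_positive(2048, 40, theta)
        print(f"n=2048 w=40 theta={theta}: p={p:.3g}")
```

Raising theta from 9 to 15 should lower the false positive probability by several orders of magnitude, which is the "to be safer" margin mentioned above.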
