I had a question a month ago … the math didn’t seem to work for me !!! for some reason .
Here is copy of the email :
hi,
(code and paper below)
Trying to do the SDR-Union calculations… (sections G. and H. of the paper)
But there is something fishy, when i try to “unionize” more than 50 vectors the number of 1s according to the formulas goes
above 50% and as consequence it tries to calculate nCr(729,1261) i.e. in N choose R , R > N !
What I’m doing wrong ?!
In [122]: fp_union(n=2000,w=40,b=10,M=20)
p1> 0.332392028245
wx> 664.78405649
n,r: 664, 10
n,r: 1336, 654
n,r: 2000, 664
Out[122]: 4.7829847875715666e-129
In [123]: fp_union(n=2000,w=40,b=10,M=50)
p1> 0.635830319913
wx> 1271.66063983
n,r: 1271, 10
n,r: 729, 1261
...
AssertionError: nCr : n < r !
from __future__ import division
import operator as op
def nCr(n,r):
print "n,r: %s, %s" % (n,r)
assert n >= r, "nCr : n < r !"
r = min(r,n-r)
if r == 0 : return 1
numerator = reduce(op.mul, xrange(n,n-r,-1))
denominator = reduce(op.mul, xrange(1, r+1) )
return numerator//denominator
#count of vectors with w-bits ON that have b-bits overlap
def overlap_set(n,w,b): #nCr(n,w) are all possible combinations
return nCr(w,b) * nCr(n-w , w-b)
#inexact match : false positive overlap set
# probability of random vector to be false positive
def fp_oset(n,w,b): return overlap_set(n,w,b) / nCr(n,w)
#exact mathch : probability of bit being 0 (zero) after M additions
def fp_union_p0(n,w,M) :
p0 = ( 1 - (w/float(n)) )**M
print "p1> %s" % (1 - p0)
return p0
#inexact match: M number of stored patterns
def fp_union(n,w,b,M):
wx = n * (1 - fp_union_p0(n,w,M))
print "wx> %s" % wx
# wx = w
return fp_oset(n, int(wx), b)