I have a conundrum … it all started when I went to implement TM again in a different manner, via segmented memory (SM, my own invention).
The good thing about SM is that it saves a lot of memory, which also makes it faster. At the same time it closely resembles a group of Numenta-schematic neurons.
I was surprised that I can use SM as a classifier out of the box.
Then I can slap on three 2D buffers, and via a controlled flow of SDRs I get a TM.
What more is there to ask! BUT … making it work requires an algorithm for correctly RESIZING SDRs …
Initially I thought easy peasy … just resize the SDR. But as I tested more, simple resizing only works for a simple two-way resize,
i.e. A <==> B. For the Classifier SM that is OK (Data <=> Segmented-Memory).
Not so for TM, which requires 3-way resizing:
Data <=> 2D Buffer <=> Segmented-Memory
(the buffer is used to mimic mini-columns)
Possible configuration in bits/sparsity (easy mapping: the number of ON bits stays at 40):
2000/40 <=> 10000/40 <=> 500/40
/even this simple case is not easy … when you downscale, N bits often map to a single bit. Later, on the upscale, how do you decide how that one bit maps back to multiple bits!!!/
Complex:
2000/40 <=> 10000/40 <=> 500/20
/both the size & the number of ON bits change: now you are guaranteed to have many <=> one bit mappings/
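To make the many <=> one problem concrete, here is a naive proportional index-rescaling sketch (illustrative names, not my isdp code):

```python
# Naive proportional rescaling: new_idx = old_idx * new_size // size
# (hypothetical sketch, NOT the isdp implementation)
def rescale(idxs, size, new_size):
    return sorted({i * new_size // size for i in idxs})

# Downscaling 10000 -> 500 squeezes many source bits onto one target bit:
bits = [40, 55, 4020, 4035, 9980]
print(rescale(bits, 10000, 500))  # -> [2, 201, 499]: 5 ON bits collapse to 3
```

Once two source bits land on the same target bit, there is no information left to split them apart again on the upscale.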
In this scenario simple resizing doesn't work. I tried/thought through 3-4 different permutations (probabilistic up/down scaling, memoization, etc.). I also looked at PCA and multidimensional scaling (no upscaling).
Then another light-bulb moment: I realized what I need is two SpatialPoolers. The problem with SP is that it needs
TRAINING, is very complex, and may not work.
To add another level of complexity, I have switched from a binary representation (01001010…) to an indexed SDR (size, [b21, b32, b453, …]),
so I now have both of them … happy & sad. (Indexed should be faster because it uses less memory at small sparsity, and later I can speed it up with numba or CUDA; not so with binary, where I have to do a full-memory search.)
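For clarity, this is what the two representations look like side by side (illustrative helper names, not the isdp API):

```python
# binary SDR = 0/1 vector; indexed SDR = (size, sorted ON-bit positions).
# Hypothetical helpers for illustration only.
def to_indexed(binary):
    return (len(binary), [i for i, b in enumerate(binary) if b])

def to_binary(size, idxs):
    out = [0] * size
    for i in idxs:
        out[i] = 1
    return out

sdr = [0, 1, 0, 0, 1, 0, 1, 0]
size, idxs = to_indexed(sdr)        # (8, [1, 4, 6])
assert to_binary(size, idxs) == sdr  # round trip is lossless
```

At 2% sparsity the indexed form stores ~40 integers instead of 2000 bits, which is where the memory and speed win comes from.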
So the question is:
How do I 3-WAY-RESIZE an SDR, preserving similarity when mapping across a middle layer, without the need for preliminary TRAINING?
Easy case, ratio 4/2 : A:2000/30 => B:4000/30 => C:2000/30 :: Must : A == C
In [408]: fb(2000,4000,30)
50:62:68:80:149:200:389:412:486:508:640:648:729:902:1102:1142:1215:1285:1553:1583:1679:1738:1764:1782:1785:1830:1837:1937:1959:1994
100:124:136:160:298:400:778:824:972:1016:1280:1296:1458:1804:2204:2284:2430:2570:3106:3166:3358:3476:3528:3564:3570:3660:3674:3874:3918:3988
50:62:68:80:149:200:389:412:486:508:640:648:729:902:1102:1142:1215:1285:1553:1583:1679:1738:1764:1782:1785:1830:1837:1937:1959:1994
Similarity : 1.0
Easy case, ratio 5/2 : A:2000/30 => B:5000/30 => C:2000/30 :: Must : A == C, but it does not
In [409]: fb(2000,5000,30)
35:62:79:150:187:307:404:418:451:655:667:717:783:943:975:1059:1149:1153:1199:1217:1228:1248:1353:1396:1495:1512:1690:1698:1768:1824
87:155:197:375:467:767:1010:1045:1127:1637:1667:1792:1957:2357:2437:2647:2872:2882:2997:3042:3070:3120:3382:3490:3737:3780:4225:4245:4420:4560
34:62:78:150:186:306:404:418:450:654:666:716:782:942:974:1058:1148:1152:1198:1216:1228:1248:1352:1396:1494:1512:1690:1698:1768:1824
Similarity : 0.4
import isdp  # my indexed-SDR library

def fb(size, new_size, spaOnbits, new_spaOnbits=None):
    # random indexed SDR with the given size/sparsity
    r = isdp.rand(size=size, spaOnbits=spaOnbits, ret='iSDP')
    print(r)
    # up-scale to the middle size, then back down
    a = isdp.change_size(r, size=size, new_size=new_size)
    print(a)
    b = isdp.change_size(a, size=new_size, new_size=size)
    print(b)
    print("Similarity : %s" % (r // b))  # '//' is the overlap-similarity operator
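The 1.0 vs 0.4 results can be reproduced with a plain-Python sketch of proportional remapping (an assumption about what change_size does internally, not its actual source): with an integer ratio the round trip is exact; with ratio 2.5 every odd index comes back off by one, exactly as in the output above.

```python
# Assumed behavior of change_size: proportional remap with floor division.
# Sketch only, NOT the real isdp source.
def remap(idxs, size, new_size):
    return [i * new_size // size for i in idxs]

# Ratio 2 (2000 -> 4000 -> 2000): *2 then //2 is exact for every index.
a = [50, 149, 1994]
assert remap(remap(a, 2000, 4000), 4000, 2000) == a

# Ratio 2.5 (2000 -> 5000 -> 2000): odd indices lose their .5 to the floor,
# so they return shifted down by one.
a2 = [35, 62, 1199]
print(remap(remap(a2, 2000, 5000), 5000, 2000))  # -> [34, 62, 1198]
```

That off-by-one shift is why plain overlap drops to 0.4: roughly the even indices survive and the odd ones miss by a single position.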
Yet those are the cases where the number of ON bits stays the same! I only change the virtual size of the SDR and recalculate the bit indexes.
A possible (not great) solution is to measure similarity not by plain overlap, but by probabilistic overlap, where nearby bits count as fractional overlaps (0.5, 0.23, 0.1, …) and an exact bit match counts as 1.0.
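A minimal sketch of that probabilistic-overlap idea (the decay weights here are illustrative, not tuned):

```python
# Fuzzy overlap: an exact index match scores 1.0, near misses score a
# decaying fraction of a match. Weights are illustrative placeholders.
def fuzzy_overlap(a, b):
    weights = {0: 1.0, 1: 0.5, 2: 0.25}
    score = 0.0
    for i in a:
        # best fractional match for this bit among all bits of b
        score += max((weights.get(abs(i - j), 0.0) for j in b), default=0.0)
    return score / len(a)  # normalized: 1.0 means a perfect match

a = [50, 149, 1994]
print(fuzzy_overlap(a, a))                # 1.0
print(fuzzy_overlap(a, [49, 149, 1995]))  # off-by-one bits still score 0.5
```

Under this measure the ratio-5/2 round trip above would score near 0.7 instead of 0.4, since the "lost" bits only drifted by one position.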