Encoding/Decoding a union of SDRs

I have built many encoders that encode/decode a single SDR, for example:

   sdr = encoder.encode(55)
   encoder.decode(sdr)

It is easy to write an encoder to do this:

   union = encoder.encode([11,22,55])

The problem is:

   encoder.decode(union)

You can do it by enumeration, but that is not feasible!
Imagine looping over 1000 SDRs on every decode.
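
For concreteness, decoding by enumeration could look roughly like this (a minimal sketch, with a made-up random-code encoder standing in for a real one; encode, SIZE and ACTIVE are all hypothetical names):

    import numpy as np

    SIZE, ACTIVE = 2048, 40

    def encode(value):
        # Hypothetical encoder: a fixed random set of ACTIVE bits per value
        # (orthogonal codes, i.e. the "unique ID" case).
        rng = np.random.default_rng(value)
        sdr = np.zeros(SIZE, dtype=np.uint8)
        sdr[rng.choice(SIZE, ACTIVE, replace=False)] = 1
        return sdr

    def decode_union_by_enumeration(union, candidates, threshold=0.9):
        # Brute force: overlap the union with every candidate's SDR and keep the
        # candidates whose code is (almost) fully contained in the union.
        return [v for v in candidates
                if np.sum(union & encode(v)) >= threshold * ACTIVE]

    union = encode(11) | encode(22) | encode(55)
    print(decode_union_by_enumeration(union, range(1000)))   # [11, 22, 55]

Every decode re-encodes and scores all 1000 candidates, which is exactly the cost in question.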

How would you develop an encoder for unions?

I can think of:

  • Nearest Neighbor indexing/search
  • Associative memory with SDRs

Each has its own costs, but consider a more serious issue: assuming the problem space becomes more complex and you use N encoders for different modalities/input streams, a union may contain SDRs encoded by different encoders.

You can use something like this


For what?

Orthogonality has its own undeniable virtues, but

Similarity is good when you want an approximate value, in case you fail to decode an exact match.


Another problem I have with a naive encoder that just overlaps SDRs for several values is that any semantics about what those values represent is lost.
e.g.
Length, Depth, Height = 129, 44, 56
SDR = encoder.encode(Length, Depth, Height)
X, Y, Z = encoder.decode(SDR)

How do you know which of the X, Y, Z map to Length?


Except for the case where you use the numbers to represent unique IDs, e.g. the union SDR represents three identities (words, names, pointers) and not three smoothly changing scalar values. Then you prefer 55 to be orthogonal to 54.

Either way, I also think the idea to use “SDR triplets” to encode/describe arbitrary relationships between arbitrary pairs-of-things might be very fruitful.

Maybe. I have yet to figure out how SDRs are more appropriate than an explicit graph description, other than a warm “SDRs are good” feeling.


I guess, do you have to? Or, for a given column’s potential pool, let it reference those multiple encoding spaces instead of trying to union them.

So:

  • encodingA
  • encodingB
  • encodingC

and column 1’s potential connections would look like:

  • encA[0…25]
  • encB[37…90]
  • encC[7…10]
    …

Instead of forcing a union, just randomly distribute connections to the various encoding spaces.

In this scenario, the potential connection pool would just be slices of the encoding’s space. Obviously you’d want to make sure that there’s the minimum amount of copying taking place.
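
A minimal sketch of that idea, with made-up encoding-space sizes (encA/encB/encC) and a hypothetical make_potential_pool helper:

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical encoding spaces, e.g. one per modality/encoder.
    spaces = {"encA": 1024, "encB": 2048, "encC": 512}

    def make_potential_pool(slice_len=32, slices_per_space=2):
        # One column's potential pool: random slices taken directly from each
        # encoding space, instead of slices of a single unioned input space.
        pool = []
        for name, size in spaces.items():
            for start in rng.integers(0, size - slice_len, slices_per_space):
                pool.append((name, slice(int(start), int(start) + slice_len)))
        return pool

    column1_pool = make_potential_pool()
    # The pool holds (space name, slice) pairs, so the encodings are only
    # referenced, never copied.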

This goes to the larger question of feature binding.

I like introducing triplets of (feature relation feature) at multiple levels of representation.

Of course, this requires introducing layers of representation in your model and the whole infrastructure that goes with this.
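
As a toy sketch only (not a worked-out model): each element of a (feature, relation, feature) triplet could itself just be an SDR, here a frozenset of random bit indices; all the names below are hypothetical:

    import numpy as np

    def random_sdr(seed, size=2048, active=40):
        # Stand-in for a real encoder: a random SDR as a frozenset of active bits.
        rng = np.random.default_rng(seed)
        return frozenset(rng.choice(size, active, replace=False).tolist())

    names = ["cup", "table", "on", "cat", "mouse", "chases"]
    sdr = {name: random_sdr(i) for i, name in enumerate(names)}

    # (feature, relation, feature) triplets.
    triplets = [
        (sdr["cup"], sdr["on"],     sdr["table"]),
        (sdr["cat"], sdr["chases"], sdr["mouse"]),
    ]

    def related_to(probe, triplets, min_overlap=30):
        # Return the object of every triplet whose subject SDR overlaps the probe.
        return [obj for subj, rel, obj in triplets if len(subj & probe) >= min_overlap]

    print(related_to(sdr["cup"], triplets) == [sdr["table"]])   # True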

I will add that DL depends on multiple levels of representation/layers to do much of its magic.

For that matter, so does the brain.


I’m asking with reference to my previous post … where a TM outputs a union of SDRs and the encoder feeds the TM directly, without an SP.

Yes, that’s what I meant… merging different dimensions complicates the design. I want to test a simpler design first.

By the way, multi-dim works too, just permute/shift them:

SDR = encoder.encode(Length >> x, Depth >> y, Height >> z)

The encoder should know x, y and z.
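
A minimal sketch of that permute/shift idea, using np.roll as the field-specific shift and a made-up random-code encoder; the shifts dict plays the role of the x, y and z that the encoder and decoder share:

    import numpy as np

    SIZE, ACTIVE = 2048, 40

    def encode_value(v):
        # Hypothetical base encoder: a fixed random code per value.
        rng = np.random.default_rng(v)
        sdr = np.zeros(SIZE, dtype=np.uint8)
        sdr[rng.choice(SIZE, ACTIVE, replace=False)] = 1
        return sdr

    # Field-specific shifts that both the encoder and the decoder know.
    shifts = {"Length": 1, "Depth": 500, "Height": 1000}

    def encode_record(**fields):
        # Union of each field's code, rolled by that field's shift.
        union = np.zeros(SIZE, dtype=np.uint8)
        for name, value in fields.items():
            union |= np.roll(encode_value(value), shifts[name])
        return union

    def decode_field(union, name, candidates=range(1000), threshold=0.9):
        # Undo the field's shift, then decode by overlap as usual.
        unshifted = np.roll(union, -shifts[name])
        return [v for v in candidates
                if np.sum(unshifted & encode_value(v)) >= threshold * ACTIVE]

    sdr = encode_record(Length=129, Depth=44, Height=56)
    print(decode_field(sdr, "Length"))   # [129]

Because each field is shifted before the union, the decoder can tell which bits came from Length rather than Depth or Height.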

By the way, simple enumeration works very well! If you ignore the performance, that is.

  • The SDR of 11 (image)
  • The SDR of 22 (image)
  • The SDR of 55 (image)
  • The union (image)
  • Overlap counts between all 1000 SDRs and the union (bottom: zoomed to only show up to 99)
  • The indices of the SDRs sorted by the overlap counts (image)

This is implemented in NumPy and takes only a few milliseconds.
But I have some ideas to improve this. Stay tuned :wink:
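
Roughly, the vectorised overlap step can be a single matrix product (a sketch along those lines, not necessarily the notebook's exact code; the random codes are stand-ins for real encodings):

    import numpy as np

    SIZE, ACTIVE, N = 2048, 40, 1000
    rng = np.random.default_rng(42)

    # N candidate SDRs as one (N, SIZE) binary matrix.
    codes = np.zeros((N, SIZE), dtype=np.uint8)
    for i in range(N):
        codes[i, rng.choice(SIZE, ACTIVE, replace=False)] = 1

    union = codes[11] | codes[22] | codes[55]

    # Overlap counts of the union with all N SDRs at once.
    overlaps = codes @ union.astype(np.int32)   # shape (N,)
    print(np.argsort(overlaps)[-3:])            # the three best matches: 11, 22, 55 (in some order)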

So this is what I’ve come up with:

  • You first measure overlaps between the union and clusters of encodings.
    • A cluster is made up of the union of randomly chosen encoded SDRs.
      You can think of a cluster as a set of dendritic segments. It encodes separate independent SDRs. (Even a temporal memory cell can represent different contexts, disambiguated through an ensemble of multiple TM cells.)
  • Now you can test only against the encodings the winner clusters include, achieving a linear speed-up (e.g. 1000 encodings, 50 clusters, 3 winner clusters: 50 + 3 * (1000/50) = 110 resulting overlap tests vs. 1000). A sketch of this two-stage search follows after the figures below.
    • You can come up with a hierarchical way of doing this, which essentially achieves an exponential speed-up. In that case, disambiguation with ensembles, like the TM uses, might be needed to cope with the high densities. :wink:

  • Cluster “SDR” examples (images)

  • Overlap counts between the union and the clusters (image)

  • The result of this method (i.e. success! :tada:) (image)
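
Something like the following two-stage search seems to match that description (a sketch under the 1000-encodings / 50-clusters / 3-winners numbers above; the random codes and variable names are stand-ins, not the notebook code):

    import numpy as np

    SIZE, ACTIVE, N = 4096, 40, 1000
    N_CLUSTERS, N_WINNERS = 50, 3
    rng = np.random.default_rng(1)

    # Hypothetical random codes for the N values, one row per value.
    codes = np.zeros((N, SIZE), dtype=np.uint8)
    for i in range(N):
        codes[i, rng.choice(SIZE, ACTIVE, replace=False)] = 1

    # Randomly assign the encodings to clusters; a cluster "SDR" is the union of its members.
    members = np.array_split(rng.permutation(N), N_CLUSTERS)
    cluster_sdrs = np.array([codes[m].max(axis=0) for m in members])

    query = codes[11] | codes[22] | codes[55]

    # Coarse step: overlap of the query with every cluster (50 tests).
    coarse = cluster_sdrs @ query.astype(np.int32)
    winners = np.argsort(coarse)[-N_WINNERS:]

    # Fine step: test only the members of the winner clusters (about 3 * 20 tests).
    candidates = np.concatenate([members[w] for w in winners])
    fine = codes[candidates] @ query.astype(np.int32)
    print(sorted(candidates[fine >= int(0.9 * ACTIVE)].tolist()))   # [11, 22, 55]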

P.S. Every experiment I did for these posts is in here, enjoy!


:wink: That is the main goal: to be faster than enumeration.

So, let me jump in here and mention my long-running project, the Semantic DB: http://semantic-db.org/
If we call a “thing” a ket, and call an “SDR triplet” a learn rule, then that is exactly the whole point of my project.
In essence a ket is just anything we can give a meaningful text label (the back end converts those text labels to unique integers). And a learn rule is a type of associative memory linking a ket with either another ket, a list or “superposition” of kets, or a sequence of kets/superpositions.

We denote a ket with |some text label>.
So, to associate an age with Fred, a list of friends of Fred, or the spelling of Fred we use the following “learn rules”:
age |Fred> => |41>
friends |Fred> => |Sam> + |Eric> + |Liz> + |Emma> + |Mary>
spelling |Fred> => |F> . |r> . |e> . |d>

Because the notation is so abstract, it has a lot of different interpretations.
One is that a collection of learn rules is just a description of a graph.
Another is that a “superposition” is almost an SDR. Indeed, SDRs are a subset of what superpositions can represent. So we can map SDRs to superpositions, but only some superpositions can be mapped to SDRs.
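
A toy illustration of that asymmetry (not the SDB back end): treat a superposition as a dict of ket-label → coefficient and an SDR as a set of active bit indices; every SDR maps across, but a weighted superposition does not map back:

    def sdr_to_superposition(sdr):
        # Every SDR maps to a superposition: each active bit becomes a ket with coefficient 1.
        return {f"bit: {i}": 1.0 for i in sorted(sdr)}

    def superposition_to_sdr(sp):
        # Only superpositions whose kets are bit labels with coefficient 1 map back to an SDR.
        if all(label.startswith("bit: ") and coeff == 1.0 for label, coeff in sp.items()):
            return {int(label[5:]) for label in sp}
        raise ValueError("this superposition has no SDR equivalent")

    friends = {"Sam": 1.0, "Eric": 0.7}   # a perfectly good superposition, but not an SDR
    print(sorted(superposition_to_sdr(sdr_to_superposition({3, 41, 997}))))   # [3, 41, 997]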

But just a description of a graph by itself is kind of boring, so we have a large collection of “operators” that manipulate kets/superpositions/sequences.

Anyway, a good starting point to look into the project further is the website: http://semantic-db.org/


But isn’t my latter post supposed to remedy that issue?

@hsgo your Jupyter code is quite clean. How long would it take to search e.g. 10k new SDRs against a previously stored database of 60k? I couldn’t figure out what to %%timeit there. This 60000/10000 split is the size of the MNIST handwritten digits dataset.

I mean, “it does it all in a few ms” sounds too small to even bother measuring performance, which in most cases of database search isn’t the case.
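
For what it’s worth, one thing that could go under %%timeit for that 60k/10k scenario is a brute-force batch overlap (dense float32 matrices as stand-ins for the stored SDRs, chunked to keep memory modest; this is not @hsgo’s clustered method):

    import numpy as np

    SIZE = 2048
    rng = np.random.default_rng(0)

    # Stand-ins: 60k stored SDRs and 10k query SDRs at roughly 2% density.
    stored  = (rng.random((60_000, SIZE)) < 0.02).astype(np.float32)
    queries = (rng.random((10_000, SIZE)) < 0.02).astype(np.float32)

    # %%timeit candidate: best-overlap match of every query against every stored SDR.
    best = np.empty(len(queries), dtype=np.int64)
    for start in range(0, len(queries), 1_000):          # chunk the 10k x 60k overlap matrix
        block = queries[start:start + 1_000] @ stored.T  # overlap counts, shape (1000, 60000)
        best[start:start + 1_000] = block.argmax(axis=1)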

The Semantic DB looks quite interesting. However, a paragraph on your GitHub page summarizes the main issue well:

But it is not even known how to apply if-then machines to elementary visual tasks such as MNIST digit recognition. That would of course be solved if we could find a good encoder for MNIST digits.

I think @bkaz mentioned here something along the lines that the “symbolic machine” should have a means to generate/derive symbols & relationships starting from pixel-level raw data like MNIST images. When children learn to read/write, a lot of that knowledge is presented in terms of “hooks”, “circles” and “bars” that are connected, moved or distorted in different ways.

And I bet elementary school children, when compared with SOTA ML methods, suck at MNIST accuracy.

Accuracy alone certainly has its relevance, but it is a very unbalanced benchmark for assessing “intelligence”.


To be more specific about:

NO. It has to build/search for its own encoder (and most likely a decoder too) from any sequence of sensory frames.

By encoder-decoder I mean a way not only to translate raw input into (symbols&relations) but backwards too, in order to “make sense” of previously unseen input, by applying “decoding” to (symbols&relations) to generate hypotheses about what new input shows.

It really comes down to the implementation of encoding. “Symbol” is a vague concept; anything can be interpreted as one. My definition of a symbol is a representation whose value depends on its position in a set of representations. In that sense pixels are not symbols within a frame: the interpretation of their intensity doesn’t depend on their coordinates. But the meaning of words does depend on their position in a sentence, and the value of bits depends on their position in a binary-coded integer.

Then an SDR probably doesn’t qualify as a symbol: it works the same regardless of which dendrite segment it is on. Neither do synapses within an SDR. My implementation is totally different: anything beyond pixels is a densely encoded (nested) symbol.

Thank you!


By encoder-decoder I mean a way not only to translate raw input into (symbols&relations) but backwards too, in order to “make sense” of previously unseen input, by applying “decoding” to (symbols&relations) to generate hypotheses about what new input shows.

Hrmm… do you mean something like this (I’m not sure if you saw this page or not)? Semantic DB
Scroll down to the bottom of that page, where there are a couple of examples of the predict-next and fuzzy-predict-next operators.

The idea behind that page is that you have some integer sequences (I chose some simple ones: counting, primes, Fibonacci and factorial), and you learn fragments of those sequences using an integer encoder (I used Gaussians for the encoder). Then, after it has learnt all the “if-then machines”, you input a fragment of an integer sequence and it predicts the next few values, or fuzzy-predicts them. It also tells you the name of the sequence it thinks the fragment belongs to.
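
(As a rough illustration only, and not the SDB implementation: the same idea can be sketched in plain NumPy with a Gaussian integer encoder, fragments stored with their continuations, and similarity-ranked prediction; every name below is made up.)

    import numpy as np

    def gauss_encode(n, size=200, sigma=2.0):
        # Encode an integer as a normalised Gaussian bump over positions 0..size-1.
        x = np.arange(size)
        v = np.exp(-((x - n) ** 2) / (2 * sigma ** 2))
        return v / np.linalg.norm(v)

    def fragment_similarity(a, b):
        # Mean similarity of two equal-length fragments, position by position.
        return float(np.mean([gauss_encode(x) @ gauss_encode(y) for x, y in zip(a, b)]))

    # Learn fragments of a few simple sequences together with their continuations.
    sequences = {
        "counting":  list(range(1, 20)),
        "fibonacci": [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89],
        "primes":    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29],
        "factorial": [1, 2, 6, 24, 120, 720, 5040],
    }
    FRAG = 4
    memory = []   # (sequence name, fragment, continuation) entries
    for name, seq in sequences.items():
        for i in range(len(seq) - FRAG - 1):
            memory.append((name, seq[i:i + FRAG], seq[i + FRAG:i + FRAG + 2]))

    def fuzzy_predict_next(fragment, top=3):
        # Rank the stored fragments by similarity and return the best continuations.
        scored = sorted(((fragment_similarity(fragment, frag), name, frag, nxt)
                         for name, frag, nxt in memory), reverse=True)
        return scored[:top]

    for score, name, frag, nxt in fuzzy_predict_next([2, 3, 5, 8]):
        print(f"{100 * score:5.1f} %   {name:9s}   pattern: {frag}   next: {nxt}")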

Let me include the output here. First, the predict-next operator that only returns 100% matches:

sa: predict-next |1 2 3 4 5>
100 %      integer sequence: counting      pattern:     1 2 3 4 5      next-1:     6
100 %      integer sequence: counting      pattern:     1 2 3 4 5      next-2:     6 7
100 %      integer sequence: counting      pattern:     1 2 3 4 5      next-3:     6 7 8
100 %      integer sequence: counting      pattern:     1 2 3 4 5      next-4:     6 7 8 9
4|results>

sa: predict-next |2 3 5 8>
100 %      integer sequence: fibonacci      pattern:     2 3 5 8      next-1:     13
100 %      integer sequence: fibonacci      pattern:     2 3 5 8      next-2:     13 21
100 %      integer sequence: fibonacci      pattern:     2 3 5 8      next-3:     13 21 34
100 %      integer sequence: fibonacci      pattern:     2 3 5 8      next-4:     13 21 34 55
4|results>

sa: predict-next |2 6 24>
100 %      integer sequence: factorial      pattern:     2 6 24      next-1:     120
100 %      integer sequence: factorial      pattern:     2 6 24      next-2:     120 720
100 %      integer sequence: factorial      pattern:     2 6 24      next-3:     120 720 5040
100 %      integer sequence: factorial      pattern:     2 6 24      next-4:     120 720 5040 40320
4|results>

sa: predict-next |2 3 5 7>
100 %      integer sequence: primes      pattern:     2 3 5 7      next-1:     11
100 %      integer sequence: primes      pattern:     2 3 5 7      next-2:     11 13
100 %      integer sequence: primes      pattern:     2 3 5 7      next-3:     11 13 17
100 %      integer sequence: primes      pattern:     2 3 5 7      next-4:     11 13 17 19
4|results>

sa: predict-next |9 9 9>
|Anomaly, no sequence detected ... >

Now the fuzzy-predict next operator, which gives fuzzy predictions:

sa: fuzzy-predict-next |2 3 5 7 11>
100 %      integer sequence: primes      pattern:     2 3 5 7 11      next-1:     13
100 %      integer sequence: primes      pattern:     2 3 5 7 11      next-2:     13 17
100 %      integer sequence: primes      pattern:     2 3 5 7 11      next-3:     13 17 19
100 %      integer sequence: primes      pattern:     2 3 5 7 11      next-4:     13 17 19 23
87.1 %      integer sequence: fibonacci      pattern:     2 3 5 8 13      next-4:     21 34 55 89
87.1 %      integer sequence: fibonacci      pattern:     2 3 5 8 13      next-3:     21 34 55
87.1 %      integer sequence: fibonacci      pattern:     2 3 5 8 13      next-2:     21 34
87.1 %      integer sequence: fibonacci      pattern:     2 3 5 8 13      next-1:     21
72.6 %      integer sequence: counting      pattern:     2 3 4 5 6      next-4:     7 8 9 10
72.6 %      integer sequence: counting      pattern:     2 3 4 5 6      next-3:     7 8 9
72.6 %      integer sequence: counting      pattern:     2 3 4 5 6      next-2:     7 8
72.6 %      integer sequence: counting      pattern:     2 3 4 5 6      next-1:     7
72.4 %      integer sequence: counting      pattern:     3 4 5 6 7      next-4:     8 9 10 11
72.4 %      integer sequence: counting      pattern:     3 4 5 6 7      next-2:     8 9
72.4 %      integer sequence: counting      pattern:     3 4 5 6 7      next-3:     8 9 10
72.4 %      integer sequence: counting      pattern:     3 4 5 6 7      next-1:     8
68.3 %      integer sequence: counting      pattern:     4 5 6 7 8      next-2:     9 10
68.3 %      integer sequence: counting      pattern:     4 5 6 7 8      next-4:     9 10 11 12
68.3 %      integer sequence: counting      pattern:     4 5 6 7 8      next-3:     9 10 11
68.3 %      integer sequence: counting      pattern:     4 5 6 7 8      next-1:     9
63.3 %      integer sequence: fibonacci      pattern:     1 2 3 5 8      next-3:     13 21 34
63.3 %      integer sequence: fibonacci      pattern:     1 2 3 5 8      next-2:     13 21
63.3 %      integer sequence: fibonacci      pattern:     1 2 3 5 8      next-1:     13
63.3 %      integer sequence: fibonacci      pattern:     1 2 3 5 8      next-4:     13 21 34 55
58.5 %      integer sequence: primes      pattern:     3 5 7 11 13      next-2:     17 19
58.5 %      integer sequence: primes      pattern:     3 5 7 11 13      next-1:     17
58.5 %      integer sequence: primes      pattern:     3 5 7 11 13      next-3:     17 19 23
58.5 %      integer sequence: primes      pattern:     3 5 7 11 13      next-4:     17 19 23 29
57.3 %      integer sequence: counting      pattern:     5 6 7 8 9      next-3:     10 11 12
57.3 %      integer sequence: counting      pattern:     5 6 7 8 9      next-4:     10 11 12 13
57.3 %      integer sequence: counting      pattern:     5 6 7 8 9      next-2:     10 11
57.3 %      integer sequence: counting      pattern:     5 6 7 8 9      next-1:     10
55.6 %      integer sequence: counting      pattern:     1 2 3 4 5      next-2:     6 7
55.6 %      integer sequence: counting      pattern:     1 2 3 4 5      next-4:     6 7 8 9
55.6 %      integer sequence: counting      pattern:     1 2 3 4 5      next-1:     6
55.6 %      integer sequence: counting      pattern:     1 2 3 4 5      next-3:     6 7 8
50.7 %      integer sequence: counting      pattern:     6 7 8 9 10      next-2:     11 12
50.7 %      integer sequence: counting      pattern:     6 7 8 9 10      next-3:     11 12 13
50.7 %      integer sequence: counting      pattern:     6 7 8 9 10      next-1:     11
50.7 %      integer sequence: counting      pattern:     6 7 8 9 10      next-4:     11 12 13 14
40|results>

sa: fuzzy-predict-next |9 9 9>
83.5 %      integer sequence: counting      pattern:     8 9 10      next-2:     11 12
83.5 %      integer sequence: counting      pattern:     8 9 10      next-3:     11 12 13
83.5 %      integer sequence: counting      pattern:     8 9 10      next-4:     11 12 13 14
83.5 %      integer sequence: counting      pattern:     8 9 10      next-1:     11
78.5 %      integer sequence: counting      pattern:     9 10 11      next-2:     12 13
78.5 %      integer sequence: counting      pattern:     7 8 9      next-3:     10 11 12
78.5 %      integer sequence: counting      pattern:     9 10 11      next-3:     12 13 14
78.5 %      integer sequence: counting      pattern:     7 8 9      next-2:     10 11
78.5 %      integer sequence: counting      pattern:     9 10 11      next-4:     12 13 14 15
78.5 %      integer sequence: counting      pattern:     9 10 11      next-1:     12
78.5 %      integer sequence: counting      pattern:     7 8 9      next-4:     10 11 12 13
78.5 %      integer sequence: counting      pattern:     7 8 9      next-1:     10
60.3 %      integer sequence: counting      pattern:     10 11 12      next-1:     13
60.3 %      integer sequence: counting      pattern:     6 7 8      next-2:     9 10
60.3 %      integer sequence: counting      pattern:     6 7 8      next-4:     9 10 11 12
60.3 %      integer sequence: counting      pattern:     10 11 12      next-4:     13 14 15 16
60.3 %      integer sequence: counting      pattern:     10 11 12      next-3:     13 14 15
60.3 %      integer sequence: counting      pattern:     10 11 12      next-2:     13 14
60.3 %      integer sequence: counting      pattern:     6 7 8      next-3:     9 10 11
60.3 %      integer sequence: counting      pattern:     6 7 8      next-1:     9
52.3 %      integer sequence: primes      pattern:     7 11 13      next-4:     17 19 23 29
52.3 %      integer sequence: primes      pattern:     5 7 11      next-3:     13 17 19
52.3 %      integer sequence: primes      pattern:     7 11 13      next-3:     17 19 23
52.3 %      integer sequence: primes      pattern:     5 7 11      next-4:     13 17 19 23
52.3 %      integer sequence: primes      pattern:     7 11 13      next-2:     17 19
52.3 %      integer sequence: primes      pattern:     5 7 11      next-2:     13 17
52.3 %      integer sequence: primes      pattern:     5 7 11      next-1:     13
52.3 %      integer sequence: primes      pattern:     7 11 13      next-1:     17
28|results>

One final note, I am a week or two away from an alpha release of a GUI version of the project.
See here (note the most recent code is in the working branch): GitHub - GarryMorrison/SemanticDB4: A graphical user interface for the Semantic DB

And another final thought. I have tried some ideas for MNIST several times. Either SDB is not suitable for MNIST, or I am not smart enough to work out how to apply it! :slight_smile:
