Encoding/Decoding a union of SDRs

I have built many encoders that encode/decode a single SDR, for example:

   sdr = encoder.encode(55)
   encoder.decode(sdr)

It is easy to write an encoder to do this:

   union = encoder.encode([11,22,55])

The problem is:

   encoder.decode(union)

You can do it by enumeration, but that is not feasible!
Imagine looping over 1000 SDRs on every decode.
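
For concreteness, decoding by enumeration could look roughly like this (a minimal sketch, with a made-up random-code encoder standing in for a real one; encode, SIZE and ACTIVE are all hypothetical names):

    import numpy as np

    SIZE, ACTIVE = 2048, 40

    def encode(value):
        # Hypothetical encoder: a fixed random set of ACTIVE bits per value
        # (orthogonal codes, i.e. the "unique ID" case).
        rng = np.random.default_rng(value)
        sdr = np.zeros(SIZE, dtype=np.uint8)
        sdr[rng.choice(SIZE, ACTIVE, replace=False)] = 1
        return sdr

    def decode_union_by_enumeration(union, candidates, threshold=0.9):
        # Brute force: overlap the union with every candidate's SDR and keep the
        # candidates whose code is (almost) fully contained in the union.
        return [v for v in candidates
                if np.sum(union & encode(v)) >= threshold * ACTIVE]

    union = encode(11) | encode(22) | encode(55)
    print(decode_union_by_enumeration(union, range(1000)))   # [11, 22, 55]

Every decode re-encodes and scores all 1000 candidates, which is exactly the cost in question.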

How would you develop an encoder for unions?

I can think of:

  • Nearest Neighbor indexing/search
  • Associative memory with SDRs

Each has its own costs, but consider a more serious issue: assuming the problem space becomes more complex and you use N encoders for different modalities/input streams, a union may contain SDRs encoded by different encoders.

You can use something like this


For what?

Orthogonality has its own undeniable virtues, but

Similarity is good when you want an approximate value, in case you fail to decode an exact match.


Another problem I have with a naive encoder that just overlaps SDRs for several values is that any semantics about what those values represent is lost.
e.g.
Length, Depth, Height = 129, 44, 56
SDR = encoder.encode(Length, Depth, Height)
X, Y, Z = encoder.decode(SDR)

How do you know which of the X, Y, Z map to Length?


Except for the case where you use the numbers to represent unique IDs, e.g. the union SDR represents three identities (words, names, pointers) and not three smoothly changing scalar values. Then you prefer 55 to be orthogonal to 54.

Either way, I also think the idea to use “SDR triplets” to encode/describe arbitrary relationships between arbitrary pairs-of-things might be very fruitful.

Maybe. I have yet to figure out how SDRs are more appropriate than an explicit graph description, other than a warm “SDRs are good” feeling.


I guess, do you have to? Or, for a given column’s potential pool, let it reference those multiple encoding spaces instead of trying to union them.

So:

  • encodingA
  • encodingB
  • encodingC

and column 1’s potential connections would look like:

  • encA[0…25]
  • encB[37…90]
  • encC[7…10]
    …

Instead of forcing a union, just randomly distribute connections to the various encoding spaces.

In this scenario, the potential connection pool would just be slices of the encoding’s space. Obviously you’d want to make sure that there’s the minimum amount of copying taking place.
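
A minimal sketch of that idea, with made-up encoding-space sizes (encA/encB/encC) and a hypothetical make_potential_pool helper:

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical encoding spaces, e.g. one per modality/encoder.
    spaces = {"encA": 1024, "encB": 2048, "encC": 512}

    def make_potential_pool(slice_len=32, slices_per_space=2):
        # One column's potential pool: random slices taken directly from each
        # encoding space, instead of slices of a single unioned input space.
        pool = []
        for name, size in spaces.items():
            for start in rng.integers(0, size - slice_len, slices_per_space):
                pool.append((name, slice(int(start), int(start) + slice_len)))
        return pool

    column1_pool = make_potential_pool()
    # The pool holds (space name, slice) pairs, so the encodings are only
    # referenced, never copied.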

This goes to the larger question of feature binding.

I like introducing triplets of (feature relation feature) at multiple levels of representation.

Of course, this requires introducing layers of representation in your model and the whole infrastructure that goes with this.
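
As a toy sketch only (not a worked-out model): each element of a (feature, relation, feature) triplet could itself just be an SDR, here a frozenset of random bit indices; all the names below are hypothetical:

    import numpy as np

    def random_sdr(seed, size=2048, active=40):
        # Stand-in for a real encoder: a random SDR as a frozenset of active bits.
        rng = np.random.default_rng(seed)
        return frozenset(rng.choice(size, active, replace=False).tolist())

    names = ["cup", "table", "on", "cat", "mouse", "chases"]
    sdr = {name: random_sdr(i) for i, name in enumerate(names)}

    # (feature, relation, feature) triplets.
    triplets = [
        (sdr["cup"], sdr["on"],     sdr["table"]),
        (sdr["cat"], sdr["chases"], sdr["mouse"]),
    ]

    def related_to(probe, triplets, min_overlap=30):
        # Return the object of every triplet whose subject SDR overlaps the probe.
        return [obj for subj, rel, obj in triplets if len(subj & probe) >= min_overlap]

    print(related_to(sdr["cup"], triplets) == [sdr["table"]])   # True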

I will add that DL depends on multiple levels of representation/layers to do much of its magic.

For that matter, so does the brain.


I’m asking with reference to my previous post … where a TM outputs a union of SDRs and the encoder feeds the TM directly, without an SP.

Yes, that’s what I meant… merging different dimensions complicates the design. I want to test a simpler design first.

By the way, multi-dim works too, just permute/shift them:

SDR = encoder.encode(Length >> x, Depth >> y, Height >> z)

The encoder should know x, y and z.
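
A minimal sketch of that permute/shift idea, using np.roll as the field-specific shift and a made-up random-code encoder; the shifts dict plays the role of the x, y and z that the encoder and decoder share:

    import numpy as np

    SIZE, ACTIVE = 2048, 40

    def encode_value(v):
        # Hypothetical base encoder: a fixed random code per value.
        rng = np.random.default_rng(v)
        sdr = np.zeros(SIZE, dtype=np.uint8)
        sdr[rng.choice(SIZE, ACTIVE, replace=False)] = 1
        return sdr

    # Field-specific shifts that both the encoder and the decoder know.
    shifts = {"Length": 1, "Depth": 500, "Height": 1000}

    def encode_record(**fields):
        # Union of each field's code, rolled by that field's shift.
        union = np.zeros(SIZE, dtype=np.uint8)
        for name, value in fields.items():
            union |= np.roll(encode_value(value), shifts[name])
        return union

    def decode_field(union, name, candidates=range(1000), threshold=0.9):
        # Undo the field's shift, then decode by overlap as usual.
        unshifted = np.roll(union, -shifts[name])
        return [v for v in candidates
                if np.sum(unshifted & encode_value(v)) >= threshold * ACTIVE]

    sdr = encode_record(Length=129, Depth=44, Height=56)
    print(decode_field(sdr, "Length"))   # [129]

Because each field is shifted before the union, the decoder can tell which bits came from Length rather than Depth or Height.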

By the way, simple enumeration works very well! If you ignore the performance, that is.

  • The SDR of 11 (image)
  • The SDR of 22 (image)
  • The SDR of 55 (image)
  • The union (image)
  • Overlap counts between all 1000 SDRs and the union (bottom: zoomed to only show up to 99)
  • The indices of the SDRs sorted by the overlap counts (image)

This is implemented in NumPy and takes only a few milliseconds.
But I have some ideas to improve this. Stay tuned :wink:
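
Roughly, the vectorised overlap step can be a single matrix product (a sketch along those lines, not necessarily the notebook's exact code; the random codes are stand-ins for real encodings):

    import numpy as np

    SIZE, ACTIVE, N = 2048, 40, 1000
    rng = np.random.default_rng(42)

    # N candidate SDRs as one (N, SIZE) binary matrix.
    codes = np.zeros((N, SIZE), dtype=np.uint8)
    for i in range(N):
        codes[i, rng.choice(SIZE, ACTIVE, replace=False)] = 1

    union = codes[11] | codes[22] | codes[55]

    # Overlap counts of the union with all N SDRs at once.
    overlaps = codes @ union.astype(np.int32)   # shape (N,)
    print(np.argsort(overlaps)[-3:])            # the three best matches: 11, 22, 55 (in some order)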

So this is what I’ve come up with:

  • You first measure overlaps between the union and clusters of encodings.
    • A cluster is made up of the union of randomly chosen encoded SDRs.
      You can think of a cluster as a set of dendritic segments. It encodes separate independent SDRs. (Even a temporal memory cell can represent different contexts, disambiguated through an ensemble of multiple TM cells.)
  • Now you can test only against the encodings the winner clusters include, achieving a linear speed-up (e.g. 1000 encodings, 50 clusters, 3 winner clusters: 50 + 3 * (1000/50) = 110 resulting overlap tests vs. 1000). A sketch of this two-stage search follows after the figures below.
    • You can come up with a hierarchical way of doing this, which essentially achieves an exponential speed-up. In that case, disambiguation with ensembles, like the TM uses, might be needed to cope with the high densities. :wink:

  • Cluster “SDR” examples (images)

  • Overlap counts between the union and the clusters (image)

  • The result of this method (i.e. success! :tada:) (image)
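
Something like the following two-stage search seems to match that description (a sketch under the 1000-encodings / 50-clusters / 3-winners numbers above; the random codes and variable names are stand-ins, not the notebook code):

    import numpy as np

    SIZE, ACTIVE, N = 4096, 40, 1000
    N_CLUSTERS, N_WINNERS = 50, 3
    rng = np.random.default_rng(1)

    # Hypothetical random codes for the N values, one row per value.
    codes = np.zeros((N, SIZE), dtype=np.uint8)
    for i in range(N):
        codes[i, rng.choice(SIZE, ACTIVE, replace=False)] = 1

    # Randomly assign the encodings to clusters; a cluster "SDR" is the union of its members.
    members = np.array_split(rng.permutation(N), N_CLUSTERS)
    cluster_sdrs = np.array([codes[m].max(axis=0) for m in members])

    query = codes[11] | codes[22] | codes[55]

    # Coarse step: overlap of the query with every cluster (50 tests).
    coarse = cluster_sdrs @ query.astype(np.int32)
    winners = np.argsort(coarse)[-N_WINNERS:]

    # Fine step: test only the members of the winner clusters (about 3 * 20 tests).
    candidates = np.concatenate([members[w] for w in winners])
    fine = codes[candidates] @ query.astype(np.int32)
    print(sorted(candidates[fine >= int(0.9 * ACTIVE)].tolist()))   # [11, 22, 55]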

P.S. Every experiment I did for these posts is in here, enjoy!


:wink: That is the main goal: to be faster than enumeration.

So, let me jump in here and mention my long-running project, the Semantic DB: http://semantic-db.org/
If we call a “thing” a ket, and call an “SDR triplet” a learn rule, then that is exactly the whole point of my project.
In essence a ket is just anything we can give a meaningful text label (the back end converts those text labels to unique integers). And a learn rule is a type of associative memory linking a ket with either another ket, a list or “superposition” of kets, or a sequence of kets/superpositions.

We denote a ket with |some text label>.
So, to associate an age with Fred, a list of friends of Fred, or the spelling of Fred we use the following “learn rules”:
age |Fred> => |41>
friends |Fred> => |Sam> + |Eric> + |Liz> + |Emma> + |Mary>
spelling |Fred> => |F> . |r> . |e> . |d>

Because the notation is so abstract, it has a lot of different interpretations.
One is that a collection of learn rules is just a description of a graph.
Another is that a “superposition” is almost an SDR. Indeed, SDRs are a subset of what superpositions can represent. So we can map SDRs to superpositions, but only some superpositions can be mapped to SDRs.
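
A toy illustration of that asymmetry (not the SDB back end): treat a superposition as a dict of ket-label → coefficient and an SDR as a set of active bit indices; every SDR maps across, but a weighted superposition does not map back:

    def sdr_to_superposition(sdr):
        # Every SDR maps to a superposition: each active bit becomes a ket with coefficient 1.
        return {f"bit: {i}": 1.0 for i in sorted(sdr)}

    def superposition_to_sdr(sp):
        # Only superpositions whose kets are bit labels with coefficient 1 map back to an SDR.
        if all(label.startswith("bit: ") and coeff == 1.0 for label, coeff in sp.items()):
            return {int(label[5:]) for label in sp}
        raise ValueError("this superposition has no SDR equivalent")

    friends = {"Sam": 1.0, "Eric": 0.7}   # a perfectly good superposition, but not an SDR
    print(sorted(superposition_to_sdr(sdr_to_superposition({3, 41, 997}))))   # [3, 41, 997]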

But just a description of a graph by itself is kind of boring, so we have a large collection of “operators” that manipulate kets/superpositions/sequences.

Anyway, a good starting point to look into the project further is the website: http://semantic-db.org/


But isn’t my latter post supposed to remedy that issue?

@hsgo your Jupyter code is quite clean. How long would it take to search e.g. 10k new SDRs against a previously stored database of 60k? I couldn’t figure out what to %%timeit there. This 60000/10000 split is the size of the MNIST handwritten digits dataset.

I mean, “it does it all in a few ms” sounds too small to even bother measuring performance, which in most cases of database search isn’t the case.
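
For what it’s worth, one thing that could go under %%timeit for that 60k/10k scenario is a brute-force batch overlap (dense float32 matrices as stand-ins for the stored SDRs, chunked to keep memory modest; this is not @hsgo’s clustered method):

    import numpy as np

    SIZE = 2048
    rng = np.random.default_rng(0)

    # Stand-ins: 60k stored SDRs and 10k query SDRs at roughly 2% density.
    stored  = (rng.random((60_000, SIZE)) < 0.02).astype(np.float32)
    queries = (rng.random((10_000, SIZE)) < 0.02).astype(np.float32)

    # %%timeit candidate: best-overlap match of every query against every stored SDR.
    best = np.empty(len(queries), dtype=np.int64)
    for start in range(0, len(queries), 1_000):          # chunk the 10k x 60k overlap matrix
        block = queries[start:start + 1_000] @ stored.T  # overlap counts, shape (1000, 60000)
        best[start:start + 1_000] = block.argmax(axis=1)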

The Semantic DB looks quite interesting. However, a paragraph on your GitHub page summarizes the main issue well:

But it is not even known how to apply if-then machines to elementary visual tasks such as MNIST digit recognition. That would of course be solved if we could find a good encoder for MNIST digits.

I think @bkaz mentioned here something along the lines that the “symbolic machine” should have a means to generate/derive symbols & relationships starting from pixel-level raw data like MNIST images. When children learn to read/write, a lot of that knowledge is presented in terms of “hooks”, “circles” and “bars” that are connected, moved or distorted in different ways.

And I bet elementary school children, when compared with SOTA ML methods, suck at MNIST accuracy.

Accuracy alone certainly has its relevance, but it is a very unbalanced benchmark for assessing “intelligence”.


To be more specific about:

NO. It has to build/search for its own encoder (and most likely a decoder too) from any sequence of sensory frames.

By encoder-decoder I mean a way not only to translate raw input into (symbols&relations) but backwards too, in order to “make sense” of previously unseen input, by applying “decoding” to (symbols&relations) to generate hypotheses about what new input shows.

It really comes down to the implementation of encoding. “Symbol” is a vague concept; anything can be interpreted as one. My definition of a symbol is a representation whose value depends on its position in a set of representations. In that sense pixels are not symbols within a frame: the interpretation of their intensity doesn’t depend on their coordinates. But the meaning of words does depend on their position in a sentence, and the value of bits depends on their position in a binary-coded integer.

Then an SDR probably doesn’t qualify as a symbol: it works the same regardless of which dendrite segment it is on. Neither do synapses within an SDR. My implementation is totally different: anything beyond pixels is a densely encoded (nested) symbol.

Thank you!


By encoder-decoder I mean a way not only to translate raw input into (symbols&relations) but backwards too, in order to “make sense” of previously unseen input, by applying “decoding” to (symbols&relations) to generate hypotheses about what new input shows.

Hrmm… do you mean something like this (I’m not sure if you saw this page or not)? Semantic DB
Scroll down to the bottom of that page, where there are a couple of examples of the predict-next and fuzzy-predict-next operators.

The idea behind that page is that you have some integer sequences (I chose some simple ones: counting, primes, Fibonacci and factorial), and you learn fragments of those sequences using an integer encoder (I used Gaussians for the encoder). Then, after it has learnt all the “if-then machines”, you input a fragment of an integer sequence and it predicts the next few values, or fuzzy-predicts them. It also tells you the name of the sequence it thinks the fragment belongs to.
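
(As a rough illustration only, and not the SDB implementation: the same idea can be sketched in plain NumPy with a Gaussian integer encoder, fragments stored with their continuations, and similarity-ranked prediction; every name below is made up.)

    import numpy as np

    def gauss_encode(n, size=200, sigma=2.0):
        # Encode an integer as a normalised Gaussian bump over positions 0..size-1.
        x = np.arange(size)
        v = np.exp(-((x - n) ** 2) / (2 * sigma ** 2))
        return v / np.linalg.norm(v)

    def fragment_similarity(a, b):
        # Mean similarity of two equal-length fragments, position by position.
        return float(np.mean([gauss_encode(x) @ gauss_encode(y) for x, y in zip(a, b)]))

    # Learn fragments of a few simple sequences together with their continuations.
    sequences = {
        "counting":  list(range(1, 20)),
        "fibonacci": [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89],
        "primes":    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29],
        "factorial": [1, 2, 6, 24, 120, 720, 5040],
    }
    FRAG = 4
    memory = []   # (sequence name, fragment, continuation) entries
    for name, seq in sequences.items():
        for i in range(len(seq) - FRAG - 1):
            memory.append((name, seq[i:i + FRAG], seq[i + FRAG:i + FRAG + 2]))

    def fuzzy_predict_next(fragment, top=3):
        # Rank the stored fragments by similarity and return the best continuations.
        scored = sorted(((fragment_similarity(fragment, frag), name, frag, nxt)
                         for name, frag, nxt in memory), reverse=True)
        return scored[:top]

    for score, name, frag, nxt in fuzzy_predict_next([2, 3, 5, 8]):
        print(f"{100 * score:5.1f} %   {name:9s}   pattern: {frag}   next: {nxt}")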

Let me include the output here. First, the predict-next operator that only returns 100% matches:

sa: predict-next |1 2 3 4 5>
100 %      integer sequence: counting      pattern:     1 2 3 4 5      next-1:     6
100 %      integer sequence: counting      pattern:     1 2 3 4 5      next-2:     6 7
100 %      integer sequence: counting      pattern:     1 2 3 4 5      next-3:     6 7 8
100 %      integer sequence: counting      pattern:     1 2 3 4 5      next-4:     6 7 8 9
4|results>

sa: predict-next |2 3 5 8>
100 %      integer sequence: fibonacci      pattern:     2 3 5 8      next-1:     13
100 %      integer sequence: fibonacci      pattern:     2 3 5 8      next-2:     13 21
100 %      integer sequence: fibonacci      pattern:     2 3 5 8      next-3:     13 21 34
100 %      integer sequence: fibonacci      pattern:     2 3 5 8      next-4:     13 21 34 55
4|results>

sa: predict-next |2 6 24>
100 %      integer sequence: factorial      pattern:     2 6 24      next-1:     120
100 %      integer sequence: factorial      pattern:     2 6 24      next-2:     120 720
100 %      integer sequence: factorial      pattern:     2 6 24      next-3:     120 720 5040
100 %      integer sequence: factorial      pattern:     2 6 24      next-4:     120 720 5040 40320
4|results>

sa: predict-next |2 3 5 7>
100 %      integer sequence: primes      pattern:     2 3 5 7      next-1:     11
100 %      integer sequence: primes      pattern:     2 3 5 7      next-2:     11 13
100 %      integer sequence: primes      pattern:     2 3 5 7      next-3:     11 13 17
100 %      integer sequence: primes      pattern:     2 3 5 7      next-4:     11 13 17 19
4|results>

sa: predict-next |9 9 9>
|Anomaly, no sequence detected ... >

Now the fuzzy-predict next operator, which gives fuzzy predictions:

sa: fuzzy-predict-next |2 3 5 7 11>
100 %      integer sequence: primes      pattern:     2 3 5 7 11      next-1:     13
100 %      integer sequence: primes      pattern:     2 3 5 7 11      next-2:     13 17
100 %      integer sequence: primes      pattern:     2 3 5 7 11      next-3:     13 17 19
100 %      integer sequence: primes      pattern:     2 3 5 7 11      next-4:     13 17 19 23
87.1 %      integer sequence: fibonacci      pattern:     2 3 5 8 13      next-4:     21 34 55 89
87.1 %      integer sequence: fibonacci      pattern:     2 3 5 8 13      next-3:     21 34 55
87.1 %      integer sequence: fibonacci      pattern:     2 3 5 8 13      next-2:     21 34
87.1 %      integer sequence: fibonacci      pattern:     2 3 5 8 13      next-1:     21
72.6 %      integer sequence: counting      pattern:     2 3 4 5 6      next-4:     7 8 9 10
72.6 %      integer sequence: counting      pattern:     2 3 4 5 6      next-3:     7 8 9
72.6 %      integer sequence: counting      pattern:     2 3 4 5 6      next-2:     7 8
72.6 %      integer sequence: counting      pattern:     2 3 4 5 6      next-1:     7
72.4 %      integer sequence: counting      pattern:     3 4 5 6 7      next-4:     8 9 10 11
72.4 %      integer sequence: counting      pattern:     3 4 5 6 7      next-2:     8 9
72.4 %      integer sequence: counting      pattern:     3 4 5 6 7      next-3:     8 9 10
72.4 %      integer sequence: counting      pattern:     3 4 5 6 7      next-1:     8
68.3 %      integer sequence: counting      pattern:     4 5 6 7 8      next-2:     9 10
68.3 %      integer sequence: counting      pattern:     4 5 6 7 8      next-4:     9 10 11 12
68.3 %      integer sequence: counting      pattern:     4 5 6 7 8      next-3:     9 10 11
68.3 %      integer sequence: counting      pattern:     4 5 6 7 8      next-1:     9
63.3 %      integer sequence: fibonacci      pattern:     1 2 3 5 8      next-3:     13 21 34
63.3 %      integer sequence: fibonacci      pattern:     1 2 3 5 8      next-2:     13 21
63.3 %      integer sequence: fibonacci      pattern:     1 2 3 5 8      next-1:     13
63.3 %      integer sequence: fibonacci      pattern:     1 2 3 5 8      next-4:     13 21 34 55
58.5 %      integer sequence: primes      pattern:     3 5 7 11 13      next-2:     17 19
58.5 %      integer sequence: primes      pattern:     3 5 7 11 13      next-1:     17
58.5 %      integer sequence: primes      pattern:     3 5 7 11 13      next-3:     17 19 23
58.5 %      integer sequence: primes      pattern:     3 5 7 11 13      next-4:     17 19 23 29
57.3 %      integer sequence: counting      pattern:     5 6 7 8 9      next-3:     10 11 12
57.3 %      integer sequence: counting      pattern:     5 6 7 8 9      next-4:     10 11 12 13
57.3 %      integer sequence: counting      pattern:     5 6 7 8 9      next-2:     10 11
57.3 %      integer sequence: counting      pattern:     5 6 7 8 9      next-1:     10
55.6 %      integer sequence: counting      pattern:     1 2 3 4 5      next-2:     6 7
55.6 %      integer sequence: counting      pattern:     1 2 3 4 5      next-4:     6 7 8 9
55.6 %      integer sequence: counting      pattern:     1 2 3 4 5      next-1:     6
55.6 %      integer sequence: counting      pattern:     1 2 3 4 5      next-3:     6 7 8
50.7 %      integer sequence: counting      pattern:     6 7 8 9 10      next-2:     11 12
50.7 %      integer sequence: counting      pattern:     6 7 8 9 10      next-3:     11 12 13
50.7 %      integer sequence: counting      pattern:     6 7 8 9 10      next-1:     11
50.7 %      integer sequence: counting      pattern:     6 7 8 9 10      next-4:     11 12 13 14
40|results>

sa: fuzzy-predict-next |9 9 9>
83.5 %      integer sequence: counting      pattern:     8 9 10      next-2:     11 12
83.5 %      integer sequence: counting      pattern:     8 9 10      next-3:     11 12 13
83.5 %      integer sequence: counting      pattern:     8 9 10      next-4:     11 12 13 14
83.5 %      integer sequence: counting      pattern:     8 9 10      next-1:     11
78.5 %      integer sequence: counting      pattern:     9 10 11      next-2:     12 13
78.5 %      integer sequence: counting      pattern:     7 8 9      next-3:     10 11 12
78.5 %      integer sequence: counting      pattern:     9 10 11      next-3:     12 13 14
78.5 %      integer sequence: counting      pattern:     7 8 9      next-2:     10 11
78.5 %      integer sequence: counting      pattern:     9 10 11      next-4:     12 13 14 15
78.5 %      integer sequence: counting      pattern:     9 10 11      next-1:     12
78.5 %      integer sequence: counting      pattern:     7 8 9      next-4:     10 11 12 13
78.5 %      integer sequence: counting      pattern:     7 8 9      next-1:     10
60.3 %      integer sequence: counting      pattern:     10 11 12      next-1:     13
60.3 %      integer sequence: counting      pattern:     6 7 8      next-2:     9 10
60.3 %      integer sequence: counting      pattern:     6 7 8      next-4:     9 10 11 12
60.3 %      integer sequence: counting      pattern:     10 11 12      next-4:     13 14 15 16
60.3 %      integer sequence: counting      pattern:     10 11 12      next-3:     13 14 15
60.3 %      integer sequence: counting      pattern:     10 11 12      next-2:     13 14
60.3 %      integer sequence: counting      pattern:     6 7 8      next-3:     9 10 11
60.3 %      integer sequence: counting      pattern:     6 7 8      next-1:     9
52.3 %      integer sequence: primes      pattern:     7 11 13      next-4:     17 19 23 29
52.3 %      integer sequence: primes      pattern:     5 7 11      next-3:     13 17 19
52.3 %      integer sequence: primes      pattern:     7 11 13      next-3:     17 19 23
52.3 %      integer sequence: primes      pattern:     5 7 11      next-4:     13 17 19 23
52.3 %      integer sequence: primes      pattern:     7 11 13      next-2:     17 19
52.3 %      integer sequence: primes      pattern:     5 7 11      next-2:     13 17
52.3 %      integer sequence: primes      pattern:     5 7 11      next-1:     13
52.3 %      integer sequence: primes      pattern:     7 11 13      next-1:     17
28|results>

One final note, I am a week or two away from an alpha release of a GUI version of the project.
See here (note the most recent code is in the working branch): GitHub - GarryMorrison/SemanticDB4: A graphical user interface for the Semantic DB

And another final thought. I have tried some ideas for MNIST several times. Either SDB is not suitable for MNIST, or I am not smart enough to work out how to apply it! :slight_smile:
