A toy symbolic system partly inspired by HTM theory

The goal of this was simple: offline learn some sequences, then, given an input sequence, predict which learnt sequence it belongs to, with some tolerance for noise. And that is it. The learnt sequences are represented using HTM-like mini-columns, which give each digit its own context. For example, the start of the sequence for Pi, 3.14159, is encoded as 3’ 1’ 4’ 1’’ 5’ 9’ (ignoring the decimal point for simplicity), where A’ is the mini-column version of the SDR for A. To be clear, our SDRs have float coefficients rather than values in {0,1}, but conceptually, and in terms of properties, they are essentially identical to binary SDRs.

The approximate mathematics this is doing:
Given an input sequence |v1 . v2 . v3>
find x such that f(x) approx-eq v1 and f(x + delta) approx-eq v2 and f(x + 2*delta) approx-eq v3
where the exact meaning of “a approx-eq b” is a consequence of how you define your encode operator.
Indeed, presumably, if we used an encoder that maps words to SDRs, such as the one cortical.io uses, then an input sequence of “the tiger ate a sheep” should match a learnt sequence of “the lion ate a lamb”. Something similar would apply to learning and recalling melodies.

Here is the definition of our scalar encoder, which probably doesn’t make too much sense, but approximates a Gaussian:
encode |*> #=> rescale smooth[0.1]^10 |_self>
If we want a binary SDR instead we could perhaps use:
encode |*> #=> clean smooth[0.1]^5 |_self>
but we will stick with our Gaussian for now.

In particular, here is our scalar encoding of “10”:
sa: encode |10>
0.0|9> + 0.0|9.1> + 0.001|9.2> + 0.006|9.3> + 0.026|9.4> + 0.084|9.5> + 0.21|9.6> + 0.42|9.7> + 0.682|9.8> + 0.909|9.9> + |10> + 0.909|10.1> + 0.682|10.2> + 0.42|10.3> + 0.21|10.4> + 0.084|10.5> + 0.026|10.6> + 0.006|10.7> + 0.001|10.8> + 0.0|10.9> + 0.0|11>
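
For anyone who wants to play with this outside the back-end, here is a rough Python sketch of what the encoder appears to be doing. The key assumption (mine, not a definition from the code) is that smooth[dx] spreads each coefficient c|x> into c/4|x - dx> + c/2|x> + c/4|x + dx>; ten passes of that kernel, followed by a rescale so the peak coefficient is 1, reproduces the coefficients shown above for encode |10>.

def smooth(sdr, dx=0.1):
    # one pass of the [1/4, 1/2, 1/4] kernel over a {position: coefficient} dict
    out = {}
    for x, c in sdr.items():
        for offset, w in ((-dx, 0.25), (0.0, 0.5), (dx, 0.25)):
            k = round(x + offset, 1)
            out[k] = out.get(k, 0.0) + c * w
    return out

def rescale(sdr):
    # divide every coefficient by the largest one, so the peak is exactly 1
    peak = max(sdr.values())
    return {x: c / peak for x, c in sdr.items()}

def encode(x):
    # rescale smooth[0.1]^10 |x>
    sdr = {round(x, 1): 1.0}
    for _ in range(10):
        sdr = smooth(sdr)
    return rescale(sdr)

print(" + ".join("%s|%s>" % (round(c, 3), x) for x, c in sorted(encode(10).items())))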

This encoder has nice similarity properties, with respect to our similarity measure (we don’t use dot product):
sa: ket-simm(encode |10>, encode |10>)
|simm>

sa: ket-simm(encode |10>, encode |10.5>)
0.263|simm>

sa: ket-simm(encode |10>, encode |11>)
0.027|simm>

sa: ket-simm(encode |10>, encode |12>)
0.0|simm>
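
Sticking with the Python sketch above, my guess at the similarity measure is: the sum of the element-wise minimums of the two SDRs, divided by the larger of their coefficient sums. With the encode function from that sketch this reproduces the four values just shown (1.0, 0.263, 0.027 and 0.0), but again, it is an assumption rather than the actual back-end definition.

def simm(f, g):
    # overlap of two SDRs: sum of element-wise minimums over the larger total coefficient sum
    overlap = sum(min(c, g[x]) for x, c in f.items() if x in g)
    return overlap / max(sum(f.values()), sum(g.values()))

for y in (10, 10.5, 11, 12):
    print(y, round(simm(encode(10), encode(y)), 3))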

But depending on what you are doing you might want your Gaussians to be wider. That’s for the future.

Given a scalar encoder we can now offline learn some sample digits of our two sequences, Pi and e. See (1) at the end of the post for full details. The idea should generalize easily to any sequence of floats, though as a consequence of our current proof-of-concept scalar encoder the floats must be limited to one decimal place. Recall, this is just a toy for now! That restriction should be fixable once we implement a full Gaussian encoder. Indeed, a full Gaussian encoder should also enable sequences of 2D or 3D co-ordinates, which are potentially more interesting. Again, details are left for the future.

How does our code represent sequences? Basically we use a very simplified model of a neuron: given an input SDR, predict an output SDR, and then chain these together to represent full sequences. Note that the idea of mini-columns is critical here; in my code it is implemented using the random-column[10] operator. Without it we couldn’t represent sequences with repeated digits, and we could only represent one sequence at a time.

So, how does random-column[k] work? It maps a ket with D dimensions to one with D+1 dimensions, with a random value in the new dimension drawn from {0, 1, ..., k-1}. The effect is that our SDRs are unique each time we use them in a sequence, cf. HTM theory.

Here we have 1D kets mapped to 2D:
sa: random-column[10] (|x1> + |x2> + |x3>)
|x1: 7> + |x2: 0> + |x3: 3>

Here we have 2D kets mapped to 3D:
sa: random-column[10] (|x1: y1> + |x2: y2> + |x3: y3>)
|x1: y1: 2> + |x2: y2: 9> + |x3: y3: 9>

And it is random, so each invocation provides a different mapping:
sa: random-column[10] (|x1> + |x2> + |x3>)
|x1: 4> + |x2: 6> + |x3: 4>

sa: random-column[10] (|x1: y1> + |x2: y2> + |x3: y3>)
|x1: y1: 5> + |x2: y2: 1> + |x3: y3: 8>

And as in HTM theory, the probability of a collision is fairly small, and can be made smaller by increasing k. By the way, we can undo this mapping using the extract-category operator, which our code does internally, but I won’t go into the details here.
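
To make that concrete, here is a small illustrative sketch of the random-column / extract-category idea in Python, with SDRs represented as dictionaries from ket labels to coefficients. This is just a conceptual picture of what the operators do, not the back-end implementation.

import random

def random_column(sdr, k=10):
    # append a random column index in 0..k-1 to every ket label, independently per ket
    return {"%s: %d" % (x, random.randrange(k)): c for x, c in sdr.items()}

def extract_category(sdr):
    # drop the last ": value" coordinate from every ket label, undoing random_column
    out = {}
    for x, c in sdr.items():
        key = x.rsplit(": ", 1)[0]
        out[key] = out.get(key, 0.0) + c
    return out

sdr = {"x1": 1.0, "x2": 1.0, "x3": 1.0}
column_sdr = random_column(sdr)
print(column_sdr)                    # e.g. {'x1: 7': 1.0, 'x2: 0': 1.0, 'x3: 3': 1.0}
print(extract_category(column_sdr))  # {'x1': 1.0, 'x2': 1.0, 'x3': 1.0}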

Finally, on to some examples.
Let’s input a sequence consisting of a single integer. Let’s try ‘2’, and then ‘3’:
sa: float-sequence |2>
e 1.0 |2> . |7> . |1> . |8> . |2> . |8> . |1> . |8> . |2> . |8> . |4>
Pi 1.0 |2> . |6> . |5> . |3> . |5>
e 1.0 |2> . |8> . |1> . |8> . |2> . |8> . |4>
e 1.0 |2> . |8> . |4>
Pi 0.071 |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi 0.071 |3> . |5>
e 0.071 |1> . |8> . |2> . |8> . |1> . |8> . |2> . |8> . |4>
Pi 0.071 |3> . |1> . |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi 0.071 |1> . |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
e 0.071 |1> . |8> . |2> . |8> . |4>
Pi 0.0 |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
e 0.0 |4>
|float-sequence>

sa: float-sequence |3>
Pi 1.0 |3> . |5>
Pi 1.0 |3> . |1> . |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi 0.071 |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi 0.071 |2> . |6> . |5> . |3> . |5>
e 0.071 |2> . |7> . |1> . |8> . |2> . |8> . |1> . |8> . |2> . |8> . |4>
e 0.071 |2> . |8> . |4>
e 0.071 |4>
e 0.071 |2> . |8> . |1> . |8> . |2> . |8> . |4>
Pi 0.0 |1> . |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi 0.0 |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
e 0.0 |1> . |8> . |2> . |8> . |1> . |8> . |2> . |8> . |4>
e 0.0 |1> . |8> . |2> . |8> . |4>
|float-sequence>

Here the first column is the name of the predicted sequence, the second is the similarity score, and the third is a walk of the matched sequence, and hence the sequence prediction given the input. (BTW, sorry about the formatting, I don’t know how to specify a fixed-width font. If someone could fix it, that would be great!)
Let’s try again, but this time with a non-integer:

sa: float-sequence |2.5>
Pi 0.619 |3> . |1> . |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi 0.619 |3> . |5>
e 0.619 |2> . |8> . |1> . |8> . |2> . |8> . |4>
Pi 0.619 |2> . |6> . |5> . |3> . |5>
e 0.619 |2> . |7> . |1> . |8> . |2> . |8> . |1> . |8> . |2> . |8> . |4>
e 0.619 |2> . |8> . |4>
Pi 0.001 |1> . |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi 0.001 |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi 0.001 |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
e 0.001 |1> . |8> . |2> . |8> . |1> . |8> . |2> . |8> . |4>
e 0.001 |4>
e 0.001 |1> . |8> . |2> . |8> . |4>
|float-sequence>

And we see ‘2.5’ matches ‘3’ and ‘2’ with the same similarity score, just as you would expect.
Now again, but with a longer input sequence:

sa: float-sequence |3.3 . 1 . 4.2>
Pi 0.664 |3> . |1> . |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi 0.078 |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
Pi 0.0 |5> . |3> . |5>
|float-sequence>

Given the longer input sequence, we have now converged down to three matching sequences. The first is a 66.4% match, since 3.3 approx-eq 3, 1 == 1 and 4.2 approx-eq 4. The second is a 7.8% match, since 3.3 approx-eq 4, 1 == 1, and 4.2 approx-eq 5. The third is a 0% match, though it must be slightly above 0 or it wouldn’t be displayed.

If we add one more digit we can filter down to a unique sequence. E.g.:
sa: float-sequence |3.3 . 1 . 4.2 . 1>
Pi 0.664 |3> . |1> . |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
|float-sequence>

Or:
sa: float-sequence |3.3 . 1 . 4.2 . 8.7>
Pi 0.078 |4> . |1> . |5> . |9> . |2> . |6> . |5> . |3> . |5>
|float-sequence>

And finally, if the input sequence is too far from all the learnt sequences we get the empty, or don’t know, ket |>.
sa: float-sequence |7 . 7 . 7>
|>
In HTM, this would probably be the point at which you flag an anomaly, i.e., no matching sequence.
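
To make the matching idea concrete, here is a minimal, self-contained Python sketch of a naive matcher: slide the input over every learnt sequence and score each alignment by the average simm of the aligned digits. To be clear, this is not the actual float-sequence operator (which works on the random-column’d pattern/then chains), so it will not reproduce the exact scores above, but it captures the spirit of “which learnt sequence does this noisy input belong to?”.

def encode(x, dx=0.1, steps=10):
    # the Gaussian-like scalar encoder from the earlier sketch, in compact form
    sdr = {round(x, 1): 1.0}
    for _ in range(steps):
        out = {}
        for k, c in sdr.items():
            for offset, w in ((-dx, 0.25), (0.0, 0.5), (dx, 0.25)):
                key = round(k + offset, 1)
                out[key] = out.get(key, 0.0) + c * w
        sdr = out
    peak = max(sdr.values())
    return {k: c / peak for k, c in sdr.items()}

def simm(f, g):
    # similarity: sum of element-wise minimums over the larger total coefficient sum
    return sum(min(c, g[x]) for x, c in f.items() if x in g) / max(sum(f.values()), sum(g.values()))

sequences = {
    "Pi": [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5],
    "e":  [2, 7, 1, 8, 2, 8, 1, 8, 2, 8, 4],
}

def match_sequence(inputs):
    # slide the input over every learnt sequence; keep alignments with a non-zero average simm
    results = []
    for name, digits in sequences.items():
        encoded = [encode(d) for d in digits]
        for start in range(len(digits) - len(inputs) + 1):
            score = sum(simm(encode(v), encoded[start + i]) for i, v in enumerate(inputs)) / len(inputs)
            if score > 0:
                results.append((name, round(score, 3), digits[start:]))
    return sorted(results, key=lambda r: -r[1])

for row in match_sequence([3.3, 1, 4.2]):
    print(row)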

Now, another example, this time using words rather than floats, but with the same back-end code. Instead of the scalar encoder, we map words to random SDRs (see (2)), so there should be no similarity between different words. It’s black and white this time, unlike our float example; we would need something like cortical.io if we wanted similarity between words.
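
As a quick sanity check on that claim, assume each word is encoded as a random pick of 10 kets out of the 65535 available (as in (2) below). The chance that two words share even a single ket is then tiny:

from math import comb

# probability that two random 10-out-of-65535 SDRs overlap in at least one ket
p_overlap = 1 - comb(65525, 10) / comb(65535, 10)
print(p_overlap)   # roughly 0.0015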

The two sentences we have learnt are (borrowed from this HTM school video https://www.youtube.com/watch?v=UBzemKcUoOk):
“boys eat many cakes” and “girls eat many pies”.

And correspondingly we have, after loading the data into memory:
sa: float-sequence |boys>
boy sentence 1.0 |boys> . |eat> . |many> . |cakes>
|float-sequence>

sa: float-sequence |girls>
girl sentence 1.0 |girls> . |eat> . |many> . |pies>
|float-sequence>

And similarly we have:
sa: float-sequence |eat>
boy sentence 1.0 |eat> . |many> . |cakes>
girl sentence 1.0 |eat> . |many> . |pies>
|float-sequence>

sa: float-sequence |many>
boy sentence 1.0 |many> . |cakes>
girl sentence 1.0 |many> . |pies>
|float-sequence>

sa: float-sequence |eat . many>
boy sentence 1.0 |eat> . |many> . |cakes>
girl sentence 1.0 |eat> . |many> . |pies>
|float-sequence>

And that’s it. Note that this is a toy system, so I don’t think it’s actually useful for anything, except perhaps as a demonstration of some ideas.
Thanks for your time.

(1): learning our two sequences:
– random encode the “end of sequence” marker:
full |range> => range(|1>,|2048>)
encode |end of sequence> => pick[10] full |range>

– define our proof of concept scalar encoder:
encode |*> #=> rescale smooth[0.1]^10 |_self>

– learn the scalar encodings for the digits:
encode |0> => encode |0>
encode |1> => encode |1>
encode |2> => encode |2>
encode |3> => encode |3>
encode |4> => encode |4>
encode |5> => encode |5>
encode |6> => encode |6>
encode |7> => encode |7>
encode |8> => encode |8>
encode |9> => encode |9>

– learn the sequence of digits of Pi:
– Pi
– 3 1 4 1 5 9 2 6 5 3 5
– name the sequence:
sequence-name |node: 1: *> => |Pi>

pattern |node: 1: 1> => random-column[10] encode |3>
then |node: 1: 1> => random-column[10] encode |1>

pattern |node: 1: 2> => then |node: 1: 1>
then |node: 1: 2> => random-column[10] encode |4>

pattern |node: 1: 3> => then |node: 1: 2>
then |node: 1: 3> => random-column[10] encode |1>

pattern |node: 1: 4> => then |node: 1: 3>
then |node: 1: 4> => random-column[10] encode |5>

pattern |node: 1: 5> => then |node: 1: 4>
then |node: 1: 5> => random-column[10] encode |9>

pattern |node: 1: 6> => then |node: 1: 5>
then |node: 1: 6> => random-column[10] encode |2>

pattern |node: 1: 7> => then |node: 1: 6>
then |node: 1: 7> => random-column[10] encode |6>

pattern |node: 1: 8> => then |node: 1: 7>
then |node: 1: 8> => random-column[10] encode |5>

pattern |node: 1: 9> => then |node: 1: 8>
then |node: 1: 9> => random-column[10] encode |3>

pattern |node: 1: 10> => then |node: 1: 9>
then |node: 1: 10> => random-column[10] encode |5>

pattern |node: 1: 11> => then |node: 1: 10>
then |node: 1: 11> => random-column[10] encode |end of sequence>

– learn the sequence of digits of e:
– e
– 2 7 1 8 2 8 1 8 2 8 4
– name the sequence:
sequence-name |node: 2: *> => |e>

pattern |node: 2: 1> => random-column[10] encode |2>
then |node: 2: 1> => random-column[10] encode |7>

pattern |node: 2: 2> => then |node: 2: 1>
then |node: 2: 2> => random-column[10] encode |1>

pattern |node: 2: 3> => then |node: 2: 2>
then |node: 2: 3> => random-column[10] encode |8>

pattern |node: 2: 4> => then |node: 2: 3>
then |node: 2: 4> => random-column[10] encode |2>

pattern |node: 2: 5> => then |node: 2: 4>
then |node: 2: 5> => random-column[10] encode |8>

pattern |node: 2: 6> => then |node: 2: 5>
then |node: 2: 6> => random-column[10] encode |1>

pattern |node: 2: 7> => then |node: 2: 6>
then |node: 2: 7> => random-column[10] encode |8>

pattern |node: 2: 8> => then |node: 2: 7>
then |node: 2: 8> => random-column[10] encode |2>

pattern |node: 2: 9> => then |node: 2: 8>
then |node: 2: 9> => random-column[10] encode |8>

pattern |node: 2: 10> => then |node: 2: 9>
then |node: 2: 10> => random-column[10] encode |4>

pattern |node: 2: 11> => then |node: 2: 10>
then |node: 2: 11> => random-column[10] encode |end of sequence>

(2): learning our two sentences:
– random encode the “end of sequence” marker:
full |range> => range(|1>,|65535>)
encode |end of sequence> => pick[10] full |range>

– learn the random encodings for the words:
encode |boys> => pick[10] full |range>
encode |eat> => pick[10] full |range>
encode |many> => pick[10] full |range>
encode |cakes> => pick[10] full |range>
encode |girls> => pick[10] full |range>
encode |pies> => pick[10] full |range>

– learn “boys eat many cakes”:
– boy sentence
– name the sequence:
sequence-name |node: 1: *> => |boy sentence>

pattern |node: 1: 1> => random-column[10] encode |boys>
then |node: 1: 1> => random-column[10] encode |eat>

pattern |node: 1: 2> => then |node: 1: 1>
then |node: 1: 2> => random-column[10] encode |many>

pattern |node: 1: 3> => then |node: 1: 2>
then |node: 1: 3> => random-column[10] encode |cakes>

pattern |node: 1: 4> => then |node: 1: 3>
then |node: 1: 4> => random-column[10] encode |end of sequence>

– learn “girls eat many pies”:
– girl sentence
– name the sequence:
sequence-name |node: 2: *> => |girl sentence>

pattern |node: 2: 1> => random-column[10] encode |girls>
then |node: 2: 1> => random-column[10] encode |eat>

pattern |node: 2: 2> => then |node: 2: 1>
then |node: 2: 2> => random-column[10] encode |many>

pattern |node: 2: 3> => then |node: 2: 2>
then |node: 2: 3> => random-column[10] encode |pies>

pattern |node: 2: 4> => then |node: 2: 3>
then |node: 2: 4> => random-column[10] encode |end of sequence>