Implementing SDR long term memory?

The question I want to ask is :

What is the mechanism to store and recall SDR’s i.e. how would you implement SDR-long-term-memory ?

The specs should at least implement :

  1. Store large amount of SDR’s … millions ?
  2. Like KV-store i.e. given a key SDR return target SDR
  3. Given partially-distorted OR simply partial-SDR return back cleaned full SDR or list of SDR’s
  4. … what does long term memory OPS can be ?

How would you do that ?
What does neuroscience says ?

2 Likes

I don’t know the neural science part of the matter. But the standard TM does the job. Given any SDR, first order memory will find the stored representation of t+1. If you really want to. You can train a TM with K on step 0 and V on step 1. Then higher order memory won’t kick in. Effectively becoming a KV store.

3 Likes

what about point 3 ? any idea

Well, you could do it using the Semantic DB, but it would require quite a bit of background to do so! Let me try to give an abbreviated version. The SDB is a knowledge representation language/notation I have been working on for a long time now. The original motivation was quantum mechanics, but that doesn’t matter here. But it does explain the origin of the names ‘kets’ and ‘superpositions’. Superpositions are a fundamental data-type in the SDB, and are a kind of generalization of SDR’s. The key thing in this write-up is that they can represent SDR’s. So for example, this set of on bits:
[13, 27, 18, 39, 45, 100, 113]

Is represented by this superposition:
|13> + |27> + |18> + |39> + |45> + |100> + |113>

I won’t go into any more detail than that, but just note we have one ket per on bit. In the back-end c++ code, the strings inside kets are mapped to unique integers, but again, the exact details are not important here. Kets are written as ‘c|s>’, where ‘c’ is some float, and ‘s’ is an arbitrary string. If ‘c’ is not specified it defaults to 1. If ‘c’ is not equal to 1, then our superpositions become a more general object than a standard binary SDR, and represent fuzzier concepts. But again, that is beyond what we need today.

OK. So the language supports superpositions/SDR’s, but what can you do with them? And hence we introduce learn rules. A learn rule simply associates a superposition (or more generally a sequence) with a ket, by way of a text label. A little like RDF triple stores in fact. Once such a triple is defined, that text label is now an operator that we can apply to the ket, and we will get back the defined superposition. Here is a for-example learn rule:
operator-label |some ket> => some-superposition

Once this is entered into our SDB shell (which has the command prompt sa:), ‘operator-label’ applied to ‘|some ket>’ will return ‘some-superposition’. Eg:

-- define our first learn rule:
sa: operator-label |some ket> => some-superposition

-- invoke our new operator:
sa: operator-label |some ket>
some-superposition

So, if we want to store key/value pairs of SDR’s, we would use learn rules with the following structure:

key |sdr 1> => key-sp-1
value |sdr 1> => value-sp-1

key |sdr 2> => key-sp-2
value |sdr 2> => value-sp-2

...

key |sdr n> => key-sp-n
value |sdr n> => value-sp-n

We can store a large number of these key/value pairs, but I doubt we could get to the millions, unfortunately. But I haven’t tested this. It would however be limited by system memory and the back-end c++.

This is all rather abstract at this point, let’s actually demonstrate it in action. We create a text file called sdr-long-term-memory.sw3 and then load it into the shell. Here is the contents of our desired file:

|context> => |storing and retrieving SDR's>

-- define the full range of available on bits for our SDR's:
-- in this case, 1 .. 100.
full |range> => seq2sp srange(|1>, |100>) |>


-- now randomly define 4 SDR key/value pairs:
key |sdr 1> => pick[5] full |range>
value |sdr 1> => pick[5] full |range>

key |sdr 2> => pick[5] full |range>
value |sdr 2> => pick[5] full |range>

key |sdr 3> => pick[5] full |range>
value |sdr 3> => pick[5] full |range>

key |sdr 4> => pick[5] full |range>
value |sdr 4> => pick[5] full |range>


-- now define our operators that interact with our key/value pairs:
strict-retrieve-value (*) #=> value drop-below[0.98] similar-input[key] |__self>
strict-retrieve-SDR-number (*) #=> drop-below[0.98] similar-input[key] |__self>
fuzzy-retrieve-SDR-number (*) #=> drop-below[0.6] similar-input[key] |__self>

Let’s try to unpack what is going on here. The first line just defines the context. A context is a collection of learn rules that is independent of learn rules in other contexts. In this case we only have one context, so it is not super critical to define it, but no harm in doing so. Next we have a learn rule that defines the set of possible on bits that our key/value SDR’s will sample from. In this case the integers from 1 to 100. It is trivial to define other sets of on bits, depending on what you are trying to achieve. Next we define 4 sets of random key/value SDR pairs. We use pick[n] to randomly sample n kets from the superposition defined by ‘full |range>’. In this case SDR’s with 5 on bits, but simply change 5 to the value you require. Finally, we define the operators we need to implement the key/value retrieval. There is a lot to unpack in how those operators work, but the central idea is that given an input, here the ‘|__self>’ ket, ‘similar-input[key]’ compares that input to every ket that has the ‘key’ operator defined, and then returns matching kets with an associated similarity of that match (passed as the coefficient of the ket). drop-below[t] then filters the result to kets with at least t percent similarity. We will demonstrate this shortly.

After loading into the shell, we now know:

$ ./SDB3_1 -i ../sw-examples/sdr-long-term-memory.sw3

sa: dump

|context> => |storing and retrieving SDR's>

full |range> => |1> + |2> + |3> + |4> + |5> + |6> + |7> + |8> + |9> + |10> + |11> + |12> + |13> + |14> + |15> + |16> + |17> + |18> + |19> + |20> + |21> + |22> + |23> + |24> + |25> + |26> + |27> + |28> + |29> + |30> + |31> + |32> + |33> + |34> + |35> + |36> + |37> + |38> + |39> + |40> + |41> + |42> + |43> + |44> + |45> + |46> + |47> + |48> + |49> + |50> + |51> + |52> + |53> + |54> + |55> + |56> + |57> + |58> + |59> + |60> + |61> + |62> + |63> + |64> + |65> + |66> + |67> + |68> + |69> + |70> + |71> + |72> + |73> + |74> + |75> + |76> + |77> + |78> + |79> + |80> + |81> + |82> + |83> + |84> + |85> + |86> + |87> + |88> + |89> + |90> + |91> + |92> + |93> + |94> + |95> + |96> + |97> + |98> + |99> + |100>

key |sdr 1> => |29> + |72> + |34> + |80> + |24>
value |sdr 1> => |77> + |33> + |48> + |10> + |96>

key |sdr 2> => |77> + |97> + |54> + |3> + |29>
value |sdr 2> => |26> + |3> + |90> + |84> + |100>

key |sdr 3> => |17> + |75> + |42> + |69> + |72>
value |sdr 3> => |74> + |17> + |13> + |6> + |30>

key |sdr 4> => |98> + |73> + |97> + |53> + |94>
value |sdr 4> => |37> + |2> + |78> + |4> + |50>


strict-retrieve-value (*) #=>  value drop-below[0.980000] similar-input[key]|__self>
strict-retrieve-SDR-number (*) #=>  drop-below[0.980000] similar-input[key]|__self>
fuzzy-retrieve-SDR-number (*) #=>  drop-below[0.600000] similar-input[key]|__self>

where the ‘dump’ command returns all currently known knowledge. Here we can see our defined learn rules, and their associated superpositions. Note of course that the SDR’s were randomly generated, so when you run the code yours will be different than those given here.

Now, onto the 3 tasks:
Task 1: Yes, we can store a large number of key/value SDR pairs. The exact upper bound on how many is unknown.

Task 2: Yes, given an input/key SDR we can return an output/value SDR. In this case we will use the ‘strict-retrieve-value’ operator.
In the shell,

sa: strict-retrieve-value (|29> + |72> + |34> + |80> + |24>)
|77> + |33> + |48> + |10> + |96>

sa: strict-retrieve-value (|17> + |75> + |42> + |69> + |72>)
|74> + |17> + |13> + |6> + |30>

Not much more to say on that. We input ‘key |sdr 1>’ and get ‘value |sdr 1>’. We input ‘key |sdr 3>’ and get ‘value |sdr 3>’. Note that the order of the kets in the input/key SDR does not matter. The similarity measure of two superpositions is the same independent on the order of the kets in those superpositions. Use the operator simm() to measure the similarity of superpositions. Eg, simm(sp-1, sp-2). By the way, our similarity measure returns values in the range 0 to 1. 0 for no kets in common, 1 for exact matches, values in between otherwise.

Task 3: Follows similarly to task 2. The key difference is the similarity threshold, which we define using ‘drop-below[t]’. Drop-below[t] removes all kets from the given superposition that have coefficient less than t. Since ‘similar-input[]’ sets the coefficient of the kets to the similarity result, the operator sequence ‘drop-below[t] similar-input[op]’ filters the results from ‘similar-input[op]’ to those that are at least t percent similar.

To demonstrate this, we will pick this partial SDR of SDR-3:
|17> + |69> + |72>

First, we show that if we use strict recall, we get back no matches. In fact we get back the ‘I don’t know’ ket |>

sa: strict-retrieve-SDR-number (|17> + |69> + |72>)
|>

But if we use fuzzy recall, we get the desired match. And we see the partial SDR and the full SDR have 60% similarity:

sa: fuzzy-retrieve-SDR-number (|17> + |69> + |72>)
0.600000|sdr 3>

There is a lot more to the Semantic DB than this! Unfortunately I don’t have a formal paper describing the project, but as a starting point I have the README on github. Code here:

The previous python implementation is here, which has another README:

Feel free to contact me at: garry -at- semantic-db.org

1 Like

TM can recall even with noise/loss data. So I think it’ll work (unless the input is so corrupted that it is unrecognizable).

I’ve did this in my RL project to force a TM to predict future motor commands in the absent of DT-errors. This is a very hacky way of doing it. TM is not meant to be used as a KV-store. But it can be forced into one. Though I don’t know how well it works quantitatively.

2 Likes

How fast is this ? I suspect it will be slower than a numpy array (or when readonly : dask or vaex, those can hold billions of items)
Or may be some sort of BinTreeMap structure when using indexed-SDR

I expected something along the lines of associative memory like Hopfield network ! Because the one-to-one storage grows pretty fast.

Heh, if you are looking for speed, then SDB is not for you :smiley: I would expect numpy to be significantly faster.
The c++ implementation of the SDB is faster than the python one but still slower than I would like.
There is probably room for optimization, but fundamentally it will be slow because of the way operator sequences work. Each step reads in a superposition or sequence, and then returns a new modified one.