Triadic Memory — A Fundamental Algorithm for Cognitive Computing

I added speed tests for triadic memory when querying for X and Y. Queries for X are 20x slower than queries for Z; queries for Y are only 10x slower.

Testing TriadicMemory with 100000 SDRs of size 1000 and solidity of 10 ON bits

100000 vectors generated in 1602 ms
100000 triples stored in 1524 ms
Begin retrieval test…
100000 queries for Z in 6626 ms
searching for errors…
0 errors - see diferents variable
Now query for X-es…
100000 queries for X in 120780 ms
0 errors - see x_diferents variable
Now query for Y-es…
100000 queries for Y in 66440 ms
0 errors - see y_diferents variable
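
For reference, the timing harness boils down to something like the following (a minimal Python sketch; a TriadicMemory class with store(x, y, z) and query(x, y, z), where the unknown slot is passed as None, is my assumed interface here, not necessarily the exact sdrsdm.py API):

import time
import numpy as np

N, P, COUNT = 1000, 10, 100_000      # SDR size, ON bits, number of triples
rng = np.random.default_rng()

def random_sdr():
    # an SDR is represented as a sorted array of its P ON-bit positions out of N
    return np.sort(rng.choice(N, size=P, replace=False))

triples = [(random_sdr(), random_sdr(), random_sdr()) for _ in range(COUNT)]

mem = TriadicMemory(N, P)            # assumed constructor
for x, y, z in triples:
    mem.store(x, y, z)

t0 = time.time()
errors = sum(not np.array_equal(mem.query(x, y, None), z) for x, y, z in triples)
print(f"{COUNT} queries for Z in {(time.time() - t0) * 1000:.0f} ms, {errors} errors")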

It seems that relying on numpy’s abilities is problematic… I’ll see if/how it can be Numba-fied.

PS: sdrsdm.py at least works correctly now…

That’s a clever trick! You’re storing the triplets in two different memory orders at the same place (using the higher and lower nibble) to take advantage of cache lines… I used your trick and got a nice improvement for Z and X queries… Y remains slow, obviously, but maybe with 16-bit storage this can also be fixed!
Thanks!
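
If I read it right, the trick in Python/numpy terms looks roughly like this: two counters, one per storage order, share each byte of a single N×N×N array, so both query directions scan the contiguous last axis (a sketch of the idea with assumed names, not the actual sdrsdm.py code):

import numpy as np

N, P = 1000, 10
mem = np.zeros((N, N, N), dtype=np.uint8)   # roughly 1 GiB for N=1000

def store(x, y, z):
    for i in x:
        for j in y:
            for k in z:
                mem[i, j, k] += 1    # low nibble: counters for the (x, y) -> z order
                mem[j, k, i] += 16   # high nibble: counters for the (y, z) -> x order

def query_z(x, y):
    sums = np.zeros(N, dtype=np.uint32)
    for i in x:
        for j in y:
            sums += mem[i, j, :] & 0x0F   # contiguous, cache-friendly read
    return np.sort(np.argsort(sums)[-P:])  # top-P bits win

def query_x(y, z):
    sums = np.zeros(N, dtype=np.uint32)
    for j in y:
        for k in z:
            sums += mem[j, k, :] >> 4     # same addresses, other nibble
    return np.sort(np.argsort(sums)[-P:])

Both nibble counters must stay below 15; see the overflow note further down.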

Before:

julia> include("triadicmemory.jl")
inserting 10000 vectors
preparing 10000 vectors took 0.224970541 seconds: 44450.26 per sec
inserting 10000 vectors took 0.120853452 seconds: 82744.84 per sec
  5.043707 seconds (2.33 M allocations: 2.086 GiB, 1.03% gc time, 0.73% compilation time)
querying z 10000 vectors took 5.043705869 seconds: 1982.67 per sec, 0 errors
  5.448775 seconds (2.30 M allocations: 2.085 GiB, 1.70% gc time, 0.36% compilation time)
querying y 10000 vectors took 5.448774256 seconds: 1835.28 per sec, 0 errors
  0.511984 seconds (2.30 M allocations: 2.085 GiB, 12.17% gc time, 3.81% compilation time)
querying x 10000 vectors took 0.511983305 seconds: 19531.89 per sec, 0 errors

After:


julia> include("triadicmemory.jl")
inserting 10000 vectors
preparing 10000 vectors took 0.222409913 seconds: 44962.02 per sec
inserting 10000 vectors took 0.222854019 seconds: 44872.42 per sec
  0.851685 seconds (2.30 M allocations: 3.038 GiB, 15.34% gc time, 2.38% compilation time)
querying z 10000 vectors took 0.851684057 seconds: 11741.44 per sec, 0 errors
  5.694842 seconds (2.28 M allocations: 3.037 GiB, 4.96% gc time)
querying y 10000 vectors took 5.694841575 seconds: 1755.98 per sec, 0 errors
  0.845069 seconds (2.48 M allocations: 3.047 GiB, 16.27% gc time, 4.28% compilation time)
querying x 10000 vectors took 0.845068203 seconds: 11833.36 per sec, 0 errors
2 Likes

@rogert, @LiorA,
whatever magic you’re doing here, would that be possible in the C version?

Since the write time doubles, I guess they write the data in two ways, each aligned to be fastest for a different query.
I guess they split each byte into two 4-bit numbers.

A useful property would be the ability to handle many writes as a form of refresh.

Exactly. It depends on counts not going over 15 (although the Scheme code should detect that apparently unlikely overflow and revert to 8 bits).
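
In code, that guard is a single nibble check before each increment (a sketch against the byte-packed layout sketched above; the fallback path is only hinted at):

def bump_low(mem, i, j, k):
    # saturate the low-nibble counter at 15 so it never spills into the
    # high-nibble counter sharing the same byte; on overflow a real
    # implementation would revert to plain 8-bit (one counter per byte) storage
    if (mem[i, j, k] & 0x0F) < 15:
        mem[i, j, k] += 1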

1 Like

Wow!!! I have looked through the PDF. This looks interesting. How could we use this in practice?

2 Likes

I’d suggest having a look at the temporal associative memory algorithm and testing it against HTM.

There’s a C implementation (command line tool or library) available on GitHub.

3 Likes

I got my brain tangled trying to write a Python version of the temporal algorithm.

For me this area looks too interesting to leave unexplored. Anybody’s help with translating the algorithms into different languages would be really appreciated.

Looks very interesting! But I would be very interested in a basic demo of the learning and prediction abilities, e.g. an app with sine waveform signals? Thanks

1 Like

The screencap above is prediction. In red are characters entered by the user (the model did not predict those); in black are predictions which were fed back into the predictor.
Each new character it sees is assigned a random SDR.
Now, the sine prediction would be interesting too.
My feeling is that if it can remember and predict 50 digits of pi, it should be able to do the same with a similar-size stream encoding a sine cycle.
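
(That character-to-SDR assignment is just a memoized random mapping; a minimal sketch:)

import numpy as np

N, P = 1000, 10
rng = np.random.default_rng()
sdr_of = {}   # each new symbol gets a fixed random SDR on first sight

def encode(symbol):
    if symbol not in sdr_of:
        sdr_of[symbol] = np.sort(rng.choice(N, size=P, replace=False))
    return sdr_of[symbol]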

Since I was playing with this algorithm a lot, I made a formal Julia package out of it. It was merged today and is publicly available for anyone interested…

https://juliahub.com/ui/Packages/TriadicMemory/gAvKO/0.1.0

3 Likes

I pushed a draft attempt to replicate the C temporal memory functions in Python. See seq_learn.py here.

See the comments for what I’m not happy about.
The main one being that I actually do not understand the algorithm.
It resembles ideas exposed here by @sebjwallace, but the C version looks like assembly code with 5 register names because… I don’t know, I’ve spent too much time with it.

2 Likes

Thanks for porting this method to Python, and for your feedback! This is really useful – obviously I need to write down a better explanation of the temporal memory algorithm.

It can be instrumented for sequence prediction, completion or generation, but the original intent was just a self-supervised temporal memory: you feed it a continuous stream of SDRs, and it learns whenever there’s something unexpected. At each step it puts out a prediction before seeing the next step.

In the example I recently posted, the user input consists of a single string of characters fed to the temporal memory in one session; I just split it into sections for the sake of readability. Within this input string there are repeating subpatterns, like ABCABCABC. Characters colored black are correctly predicted, and red characters were incorrectly predicted (or, if you want, recognized as anomalies). When switching from one repeated subpattern to a different one, you see how the first few characters are mispredicted, and how the system learns the new pattern.

Just looking at the program code, this algorithm is indeed hard to understand. At the core of it, it’s a simple feedback circuit consisting of two triadic memory instances. Information flows and is processed in discrete time steps. Each line carries an SDR:

What you see here is a basic example of building circuits out of triadic memory units. Note that there are no parameters (except N and P); we’re just wiring up components. Larger circuits can be built to model more complex functionality, for example a temporal memory that can recognize symmetries, or hierarchies. You could call this an electrical engineering approach to modelling a brain, with associative memories being the main building blocks.
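
To make the wiring idea concrete: in code a unit is just a memory plus three SDR registers, and it only operates once all three are loaded (a Python sketch of the concept, reusing the assumed TriadicMemory class from earlier; this is not the reference implementation):

class TriadicUnit:
    # a triadic memory plus its three input registers; it stays idle
    # until every register holds non-zero data
    def __init__(self, mem):
        self.mem = mem
        self.slots = [None, None, None]

    def load(self, pos, sdr):
        self.slots[pos] = sdr

    def ready(self):
        return all(s is not None and len(s) > 0 for s in self.slots)

    def fire(self):
        if self.ready():
            self.mem.store(*self.slots)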

2 Likes

+1 on the request for a more detailed explanation of how the prediction using two memories works…
Continuing the EE approach, what is the state of each of the registers at each time step?
What’s the “pipeline” like?

OK, here’s an animation of how a sequence ABCDEF flows through an initially blank circuit:

The triadic memory units operate only when all three positions are filled with non-zero data, so they do nothing in the first two steps.

At step 3, triadic memory M1 generates a random vector f1 and stores the triple {f1, A, B}.

At step 4, M2 learns that C follows A, B, storing the triple {A, B, C}.

Note that the feedback vectors f accumulate history: f2 is associated with C and B+f1, where f1 carries information about A from the previous step.

The prediction step is not explicitly shown in the diagram. It’s a query on M2 right after its two bottom registers are filled with new values propagated from M1.

Why use two separate triadic memory instances? First, the triples to be stored would clash. (There are possible hacks to work around this and reduce memory usage, but that would make it even harder to explain.) Second, I want to illustrate the concept that this most basic circuit can be expanded with additional memory units to build a deeper, hierarchical temporal memory. I’m currently working on this, and would very much welcome ideas and collaboration.
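
Written out as code, the steps above amount to the following (only the stores the description names; bundle() stands for an assumed superposition of two SDRs, here simply the union of their ON bits):

import numpy as np

N, P = 1000, 10
rng = np.random.default_rng()

def random_sdr():
    return np.sort(rng.choice(N, size=P, replace=False))

def bundle(a, b):
    # assumed superposition of two SDRs: union of their ON bits
    return np.union1d(a, b)

M1, M2 = TriadicMemory(N, P), TriadicMemory(N, P)   # assumed class, as before
A, B, C = random_sdr(), random_sdr(), random_sdr()

# steps 1-2: registers not yet filled, both units stay idle
f1 = random_sdr()
M1.store(f1, A, B)                 # step 3: M1 invents f1, stores {f1, A, B}
M2.store(A, B, C)                  # step 4: M2 learns that C follows A, B
f2 = random_sdr()
M1.store(f2, bundle(B, f1), C)     # f2 is associated with C and B + f1
# the prediction is a query on M2 for the missing third item, issued right
# after its two bottom registers are refreshed with values propagated from M1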

My question to the neuroscientists here: If you rotate those diagrams by 90 degrees, will the dataflow between horizontal layers resemble what’s going on in the neocortex?

2 Likes

A sine signal is much different from 60 digits of pi, because you never get exactly the same floating-point values twice.
For HTM it is a good chance to learn the diversity and make its predictions more robust.

Yes, I know.
I guess one requirement is that consecutive encoded values have a good overlap.
Then there’s a “distance” parameter which controls how close an SDR has to be to an earlier instance in order to be considered a repetition of the previous SDR/value and assume its context for prediction.
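
For a sine input that suggests something like a sliding-window scalar encoder, where nearby values share most of their ON bits (a common HTM-style scheme, sketched here; it is not part of the triadic memory code):

import numpy as np

N, P = 1000, 10   # SDR size and ON bits, as elsewhere in the thread

def encode_scalar(value, vmin=-1.0, vmax=1.0):
    # map the value to a window of P consecutive ON bits; close values
    # produce heavily overlapping windows (the "good overlap" above)
    pos = int((value - vmin) / (vmax - vmin) * (N - P))
    return np.arange(pos, pos + P)

a = encode_scalar(np.sin(0.10))
b = encode_scalar(np.sin(0.11))
print(len(np.intersect1d(a, b)))   # nearby phases share about half their bits here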

I haven’t tried it, since I’m not sure the Python code I wrote is really what it should be.

It looks correct to me. Does it produce comparable results?

Let’s try with a sine signal! How does HTM encode real numbers to SDRs?

It can be tested by running python interactively:

$ python -i seq_learn.py
>>>
>>> roll('The quick brown fox jumped over the lazy dog')

roll() accepts two parameters: the first is an input sequence, and the optional second is the number of auto-predictions to make after the input sequence has been fed into the predictor.

The sequence can be any string (text between quotes), which is split into words by spaces; each new word is assigned a new random SDR in a dictionary. Previously seen words retrieve their original SDR from the dictionary.

The list of word-associated SDRs is then fed into the sequence predictor (SeqPred.predict()).

roll() prints two outputs. The first is a list of (actual_input > predicted_input) pairs over the input string; an underscore (‘_’) means no prediction was found.

If a number of auto-fed predictions is specified, it then prints whatever SDRs (actually their associated words in the dictionary) were self-predicted, separated by ‘>’.
The self-prediction loop ends either when a prediction fails to retrieve anything or when the specified number of predictions has been made.
Predicted sequence lines end with ‘.’.
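
Roughly, roll() behaves like this (a behavioral sketch, not the actual seq_learn.py code; encode()/sdr_of are the word-to-SDR dictionary described above, and predictor is an assumed SeqPred instance):

import numpy as np

def word_of(sdr):
    # reverse lookup: the known word whose SDR best overlaps the prediction
    if sdr is None or len(sdr) == 0:
        return None
    return max(sdr_of, key=lambda w: len(np.intersect1d(sdr_of[w], sdr)))

def roll(text, auto=0):
    words = text.split()
    print(f'rolling: "{words}" :')
    pairs, pred = [], None
    for w in words:
        pairs.append(f"({w}>{word_of(pred) or '_'})")  # predicted vs. actual
        pred = predictor.predict(encode(w))            # prediction for next step
    print("(INPUT,PREDICTED)... " + ",".join(pairs) + ", .")
    chain = []
    for _ in range(auto):                              # self-feed predictions
        w = word_of(pred)
        if w is None:
            break
        chain.append(w)
        pred = predictor.predict(encode(w))
    if chain:
        print("AUTO FEED: >" + ">".join(chain) + " .")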

PS: what should be mentioned is that something goes wrong:

>>> roll('The quick brown fox jumped over the lazy dog') 
rolling: "['The', 'quick', 'brown', 'fox', 'jumped', 'over', 'the', 'lazy', 'dog']" :
(INPUT,PREDICTED)... (The>_),(quick>_),(brown>brown),(fox>_),(jumped>_),(over>_),(the>_),(lazy>jumped),(dog>dog), .

>>> roll('The quick', 10)
rolling: "['The', 'quick']" :
(INPUT,PREDICTED)... (The>_),(quick>brown),
AUTO FEED: >fox>jumped>over>the>lazy>dog>The>quick>brown>fox .

As you can see, when it first sees the words it makes strange predictions.

The second time, I give it the first two words of the sequence (with 10 auto-feeds) and it seems to work correctly.

What bugs me are the weird predictions made when first rolling a sequence.