Triadic Memory — A Fundamental Algorithm for Cognitive Computing

@POv - I think there's a very simple way to overcome memory size limitations.
“Canonically”, for e.g. DiadicMemory, one needs to allocate a 2D byte array of N*(N-1)/2 rows (one per pair of bit positions) and N columns.

During mem.store(X, Y), the bits of X are expanded into P*(P-1)/2 row addresses (one per pair of ON bits), and Y is added as a pattern of bits (e.g. 00010011 …) into each of those rows.

One problem with this is that for large-ish SDRs, e.g. a 10 kbit SDR, one would have to allocate 0.5 trillion bytes of memory.

There is a simple trick to allocate an array with as many rows as you like: simply reduce each row address modulo the number of available rows (see the sketch after the list below).

Yes, errors will happen sooner, but:

  1. the user has the option of trading off memory space against store() capacity.
  2. with a flat, 1D array (as in the C code) there is virtually no restriction on SDR length. SDRs can be e.g. P=10, N=100,000 long, and the semantics become very rich, because each bit can represent a certain unique property.
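
To make the folding concrete, here is a minimal Python/NumPy sketch (class and parameter names are mine, purely illustrative, not the repository code): a DiadicMemory-style store whose number of rows is fixed up front, with every pair address reduced modulo that row count.

import numpy as np

class HashedDiadicMemory:
    # X and Y are sorted arrays of ON-bit indices (P of them each, out of N).
    def __init__(self, N=1000, P=10, rows=100_000):
        self.N, self.P = N, P
        # ~100 MB here vs ~0.5 GB for the canonical N*(N-1)/2 rows
        self.mem = np.zeros((rows, N), dtype=np.uint8)

    def _rows(self, x):
        # all pair addresses (i < j) of the ON bits of x, folded into the
        # available rows by a modulo reduction
        n_rows = self.mem.shape[0]
        return [(i * self.N + j) % n_rows
                for a, i in enumerate(x) for j in x[a + 1:]]

    def store(self, x, y):
        for r in self._rows(x):
            self.mem[r, y] += 1          # add Y's bit pattern into the row

    def query(self, x):
        sums = self.mem[self._rows(x)].sum(axis=0, dtype=np.int64)
        return np.sort(np.argsort(sums)[-self.P:])   # P most active columns

Fewer rows means pair addresses collide earlier, which is exactly the memory-vs-capacity trade-off of point 1 above.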

@DanML

I had Numba core-dumping on me, without any other messages, while rewriting an older, similar module.

I tracked it down to the fact that I had inadvertently accessed a NumPy array out of its range (an addressing mistake).


That’s a brilliant idea, Cezar.

Well, it might be. :stuck_out_tongue:

One can imagine a “global” space of 1 Mbit SDRs where one:

  • can create an arbitrary number of smaller (1-100 kbit) sub-domains, allocated randomly, to represent each … whatever: speech, Chinese, math, feeding, home town, Ukraine, chess, driving and so on.
  • not only will they seldom collide (an occasional one-pair collision between two different 1%-narrow sub-spaces is quite unlikely and of low relevance), but it also allows cross-domain correlation SDRs (contexts/concepts/ideas) to be formed whenever and wherever they are needed or appropriate.

Maybe this is how the brain works…
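
A toy Python illustration of the sub-domain idea (the global size, helper names and numbers here are made up for the example): a sub-domain is just a random subset of the global bit positions, so SDRs from different sub-domains live in the same space and can be combined directly.

import numpy as np

GLOBAL_N = 1_000_000                      # the "global" bit space
rng = np.random.default_rng(0)

def allocate_subdomain(size):
    # a sub-domain is a random selection of global bit positions
    return np.sort(rng.choice(GLOBAL_N, size=size, replace=False))

speech = allocate_subdomain(10_000)       # two 1% sub-domains
chess  = allocate_subdomain(10_000)

def to_global(subdomain, local_on_bits):
    # map a sub-domain-local SDR (indices of ON bits) into global positions
    return subdomain[np.asarray(local_on_bits)]

# the sub-domains themselves share only ~100 of their 10,000 positions,
# so a sparse SDR from one sub-domain almost never collides with one from
# another on more than a stray bit; a cross-domain concept can simply
# union bits drawn from several sub-domains
print(len(np.intersect1d(speech, chess)))
concept = np.union1d(to_global(speech, range(10)), to_global(chess, range(10)))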

well… some sort of equivalent mechanics for “wide spreading” and retrieval of “knowledge” across a huge space makes sense.

I would wager that for every bit out of the very large N there's also a small processing unit (could be a cortical minicolumn?) which makes predictions by looking at the rest of the active bits within its context SDR.

I got a bit carried away - I thought it could discard the need for a fixed N. But N is still needed: all retrieved “rows” have to be sum()-ed, so N has to be the row length.


I tried this - Diadic is fine, since it uses a 2D memory array.
Triadic, however, with flattened 1D storage (as in the C code), gets excruciatingly slow.
Even the cubic 3D TriadicMemory gets very slow when searching for X instead of Z.


PS: the currently pushed sdrsdm.py can't retrieve X and Y at all, due to a misspelled function call.


While thinking about scaling TM, I figured out that an even simpler trick can be applied to the small family of Di-/TriadicMemory and my SDR Indexer.


Here's how it works:

  • assume a 10 kbit TriadicMemory. That's 1000× bigger and slower than a puny 1 kbit one - a whopping 1 TByte - and at 1% sparsity each search/update accesses 1 million locations (bytes actually, so that's 1M/8 memory transfers).
  • we get ambitious and want to spread it across 10 machines.
  • For conceptual simplicity, assume there is a Dispatcher node which is the API front-end receiving storage and retrieve requests for full 10 kbit SDRs.
  • recall that a TriadicMemory storing (X,Y) → Z is simply a cube matrix of dimensions N×N×N, where N is the encoding size of the SDRs it operates with.
  • Now, every node out of the 10, instead of storing the whole cube, stores only a slice of it. Let's say we slice it by Z.
  • Each node, instead of storing all 10 kbits of Z, stores only 1 kbit: node 1 stores the first kbit, node 10 the tenth. As easy as that.
  • When querying the “cube” for patterns X, Y, each node restores its own 1 kbit of the response and sends it back to the Dispatcher node, which simply concatenates the 10 responses to restore the original Z.
  • the slicing can be done on any axis, not only Z.
  • the difference when slicing by e.g. Y is that every node, instead of resolving 1 kbit of the response, computes a 10 kbyte partial response,
  • which means the Dispatcher node has to receive and add up all 10 × 10 kbyte partial responses in order to restore the original Z. That sucks up ~100× more bandwidth than receiving 10 × 1 kbit responses to concatenate.
  • That doesn't mean slicing by Z is the best thing to do, because each node still has to do its own 10k × 10k address decoding, and at 1% sparsity it has to retrieve from memory 10k × 1 kbyte (≈ 10 Mbyte) of data per query, which still might be prohibitive; slicing by X, Y or both diminishes the compute/memory bandwidth demand per node proportionally with the number of slices in the XY plane.
  • so probably the best way to optimize the whole behemoth is to slice in two stages: first by Z, with every Z slice then cut into 4×4 “cubes” in the XY plane, each with its own node.
    Every Z slice has its own sub-dispatcher that acts as a front-end to its 16 node cubes.

It sounds massive (and it is), but it is conceptually/architecturally very simple, and what is most beautiful about it is that there's full flexibility in scaling both by node capacity (compute, memory size & bandwidth) and by network bandwidth. A sketch of the slice-by-Z variant is given below.

Imagine a 512-bit AVX CPU adding 64 bytes at a time per clock.
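
Here is a toy, single-process Python sketch of the slice-by-Z arrangement (the Dispatcher/node class names and APIs are invented for illustration; for simplicity each node returns its local counter sums and the dispatcher picks the global top P winners, rather than each node binarizing its own kilobit as described above):

import numpy as np

class ZSliceNode:
    # holds only the N x N x band slab of the (X, Y) -> Z counter cube
    def __init__(self, N, z_lo, z_hi):
        self.z_lo, self.z_hi = z_lo, z_hi
        self.cube = np.zeros((N, N, z_hi - z_lo), dtype=np.uint8)

    def store(self, x, y, z):
        # x, y, z are sorted arrays of ON-bit indices; keep only the Z bits
        # that fall into this node's band
        z_local = z[(z >= self.z_lo) & (z < self.z_hi)] - self.z_lo
        if z_local.size:
            self.cube[np.ix_(x, y, z_local)] += 1

    def partial_sums(self, x, y):
        # sum the addressed P x P sub-plane for every local Z column
        return self.cube[np.ix_(x, y)].sum(axis=(0, 1), dtype=np.int64)

class Dispatcher:
    def __init__(self, N=200, P=4, slices=10):       # tiny demo sizes
        band = N // slices
        self.P = P
        self.nodes = [ZSliceNode(N, k * band, (k + 1) * band)
                      for k in range(slices)]

    def store(self, x, y, z):
        for node in self.nodes:                      # remote calls in a real deployment
            node.store(x, y, z)

    def query_z(self, x, y):
        # reassemble the full-length response from the per-node partial
        # responses, then keep the P most active Z bits
        sums = np.concatenate([n.partial_sums(x, y) for n in self.nodes])
        return np.sort(np.argsort(sums)[-self.P:])

Slicing by X or Y instead only changes which axis each node's slab covers and what form of partial response it sends back, as discussed in the list above.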

This seems like a good approach to scale up Triadic Memory.

Incidentally, I just pushed a memory-optimized C implementation of the Dyadic Memory algorithm (effectively a Kanerva SDM) which works for dimensions up to 20,000.


Interesting, it looks like it is dynamically allocated?

Right, here the hard storage locations are dynamically allocated sparse data structures.
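
The actual C data structures aren't shown here, but the general idea of lazily allocated hard locations can be sketched in a few lines of Python (dictionary-based, purely illustrative): a row only consumes memory once something has been written to it.

from collections import defaultdict

class SparseDyadicMemory:
    def __init__(self, N=20_000, P=10):
        self.N, self.P = N, P
        # row address -> {column: counter}; untouched rows occupy no memory
        self.rows = defaultdict(lambda: defaultdict(int))

    def _addresses(self, x):
        # one address per pair (i < j) of ON bits of x
        return [i * self.N + j for k, i in enumerate(x) for j in x[k + 1:]]

    def store(self, x, y):
        for a in self._addresses(x):
            row = self.rows[a]
            for j in y:
                row[j] += 1

    def query(self, x):
        sums = defaultdict(int)
        for a in self._addresses(x):
            for j, c in self.rows.get(a, {}).items():
                sums[j] += c
        # the P most active columns form the recalled SDR
        return sorted(sorted(sums, key=sums.get, reverse=True)[: self.P])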

If you pull two C functions out of main() - one for writing and one for querying - probably more people would get interested in playing with it as a library instead of via the command line, which for large tests gets tedious and cripples its performance.


Scheme translations of triadicmemory.c and triadic_test.py: triadicmemory.ss, triadic-test.ss

$ triadic-test.ss x       
Testing TriadicMemory with 100000 SDRs of size 1000 and solidity of 10 ON bits
100000 vectors generated in 145ms
100000 triples stored in 1237ms
100000 queries in 8826ms

Query speed is similar for {_,y,z} and {x,y,_} - about 12K/sec; {x,_,z} is about 4K/sec


That's cool! I can't speak Scheme, but the short explanatory text of the algorithm is quite well written too.

Great to have a Scheme version of the algorithm! The timings look really impressive.

I made some modifications so that the same code can be used as a library or command line program.


A quick implementation in Julia … results are not that great but I'm still working on it: GitHub - alior101/TriadicMemory: Triadic Memory Algorithm (pull request sent to POv).
If there are Julia masters here, please comment… I'm seeing an unexpectedly high number of memory allocations on querying.

julia> include("triadicmemory.jl")
inserting 10000 vectors
preparing 10000 vectors took 0.467523515 seconds: 21389.3 per sec
inserting 10000 vectors took 0.130208133 seconds: 76800.12 per sec
  5.879034 seconds (3.47 M allocations: 2.144 GiB, 3.52% gc time, 4.33% compilation time)
querying z 10000 vectors took 5.879033064 seconds: 1700.96 per sec, 0 errors
  5.884058 seconds (2.86 M allocations: 2.111 GiB, 1.55% gc time, 2.31% compilation time)
querying y 10000 vectors took 5.884057576 seconds: 1699.51 per sec, 0 errors
  0.657309 seconds (2.86 M allocations: 2.111 GiB, 11.48% gc time, 19.37% compilation time)
querying x 10000 vectors took 0.657307676 seconds: 15213.58 per sec, 0 errors

Happy to see a Julia implementation … thanks Lior!

Can you please give more details on the “In the default nybble mode both xyz counts and zyx counts are stored in each byte” trick in your Scheme implementation? I'm curious to understand how your code achieves better speed than C …

When storing {x,y,z}, the standard nested loops (do x… (do y… (do z…))) are followed by (do z… (do y… (do x…))) loops incrementing the top 4 bits (“nybble”) of each byte: store code (lines 177-199).

Then, to recall x from {_,y,z}, the corresponding (do z… (do y… (do x…))) loops are used (lines 257-265). The indexing calculations use Scheme R6RS fixnum operations, for which Chez Scheme (with appropriate compiler options, lines 81-91) produces good code.
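
In rough NumPy terms (illustrative only, not the Scheme code; counter saturation at 15 and other details are glossed over), the packing amounts to this: each (x,y,z) count goes into the low nybble at flat address x·N²+y·N+z and also into the high nybble at z·N²+y·N+x, so both recall directions scan contiguous memory.

import numpy as np

N, P = 100, 4                               # tiny demo sizes (N=1000 would need ~1 GB)
mem = np.zeros(N * N * N, dtype=np.uint8)   # one flattened byte cube

def store(x, y, z):
    # x, y, z are arrays of ON-bit indices
    for i in x:
        for j in y:
            mem[i * N * N + j * N + z] += 1      # xyz count in the low nybble
    for k in z:
        for j in y:
            mem[k * N * N + j * N + x] += 16     # zyx count in the high nybble

def recall_z(x, y):
    sums = np.zeros(N, dtype=np.int64)
    for i in x:
        for j in y:
            base = i * N * N + j * N
            sums += mem[base:base + N] & 0x0F    # contiguous scan of low nybbles
    return np.sort(np.argsort(sums)[-P:])

def recall_x(y, z):
    sums = np.zeros(N, dtype=np.int64)
    for k in z:
        for j in y:
            base = k * N * N + j * N
            sums += mem[base:base + N] >> 4      # contiguous scan of high nybbles
    return np.sort(np.argsort(sums)[-P:])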