No repeating vectors (SDRs) in the list. The chance of a collision is low, but still present. My code found those and dropped them; they did occur. Probably just weak random number generators.
I’m curious if you can find duplicates in 1M consecutive samples of randomSDR()
I looked at your code and it does indeed have to check that no two bits are repeated in the 10-bit list. But randomSDR, which shuffles 1000 consecutive values, cannot produce repeating bits in the first place.
Just to clarify, we are talking about the result of multiple calls to randomSDR() not ever returning the same value, which relies entirely on the rand API call underneath.
Without looking at it, would you take a bet? Even if the underlying API repeats itself, each new shuffle is a new permutation of the previous shuffle. It would be quite tricky to get duplicates even when applying the same initial position permutation a million times.
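For concreteness, here is a rough sketch of the kind of shuffle-based generator and duplicate check we are talking about (illustrative only; the names and parameters are assumed, not the actual sdrsdm code):

import numpy as np

N, P = 1000, 10  # assumed SDR space size and number of active bits

def random_sdr(rng):
    # Shuffling 0..N-1 and taking the first P indices guarantees
    # no repeated bits within a single SDR.
    return tuple(sorted(rng.permutation(N)[:P]))

def count_duplicates(samples=1_000_000, seed=0):
    # Count how many of `samples` consecutive SDRs have been seen before.
    rng = np.random.default_rng(seed)
    seen, dupes = set(), 0
    for _ in range(samples):
        sdr = random_sdr(rng)
        if sdr in seen:
            dupes += 1
        seen.add(sdr)
    return dupes

print("duplicates in 1M samples:", count_duplicates())

With 10-of-1000 SDRs there are roughly 2.6e23 possible patterns, so duplicates across a million samples should be vanishingly rare unless the underlying generator is broken.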
Either way, betting aside for the moment, here’s the output from the sdrsdm implementation.
The new code is a lot faster, but visual inspection says <<100k. It was actually 16870 at the first failure to store and perfectly retrieve.
It could be that we are doing different tests and finding different things. Maybe we should start there?
Hmm, I thought I already told you the upper limit in your random SDR generator should be 1000, not 256? You are testing 10/256 SDRs, not 10/1000.
Ok, that would make a lot of difference. Looks like my test is wrong. I’ll try and re-do.
So, with a proper SDR space (and hence no collisions either):
Glad to be finally back on the same page
I had to disable JIT because it segfaulted (RAM limit, I assume?). Without JIT it sits fine at ~1.2 GB, but it is so slow. I’ll play with that separately. This box has plenty of RAM headroom, so it was a surprise…
I’ll run larger tests to get the same shape back (presumably) but at much higher scales.
You were right.
I’ve also updated the github code.
Random thought to keep in mind: if the code is compiled as 32-bit, you’ll still be capped at 4 GB of RAM regardless of the host.
Oh thanks.
Regarding:
I had to disable JIT because it seg faulted (assume RAM limit?).
Can you tell me which versions of OS/Python/numba/numpy you use? I know numba can be finicky; I’m curious in which contexts.
If you’re on Linux, for numba and numpy you can run
$ pip list | grep num > versions.txt
PS
It was actually 16870 for the first failure
According to the author, capacity starts to decay around (N/P) ** 3 entries. Interestingly, (256/10) ** 3 is 16777.216.
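A quick sanity check of that rule of thumb, for both the 10/256 and 10/1000 settings discussed here:

def capacity_estimate(N, P):
    # Capacity rule of thumb quoted above: decay starts around (N/P) ** 3 entries.
    return (N / P) ** 3

print(capacity_estimate(256, 10))   # 16777.216 -> right next to the observed 16870
print(capacity_estimate(1000, 10))  # 1000000.0 -> about 1M entries for 10/1000 SDRs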
numba 0.55.2
numpy 1.21.2
Python 3.9.2 (default, Feb 28 2021, 17:03:44)
[GCC 10.2.1 20210110] on linux
Or 1M at 10/1000. Looks a little low on this run, but I guess you’d need a distribution to really know.
@POv Peter,
Is it possible to delete values after a write?
Sure! Change + to - in the write function.
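Roughly like this, in a toy Python sketch (illustrative only, not the actual triadicmemory.c or sdrsdm code): store() increments every addressed counter, and delete is the same traversal with a decrement.

import numpy as np

class ToyTriadicMemory:
    # Toy counter-based triadic memory; a small n keeps the n**3 array manageable.
    def __init__(self, n=100):
        self.mem = np.zeros((n, n, n), dtype=np.uint8)

    def store(self, x, y, z):
        # x, y, z are lists of active bit indices; bump every addressed counter.
        for i in x:
            for j in y:
                self.mem[i, j, z] += 1

    def remove(self, x, y, z):
        # Delete is literally the same loop with "-" instead of "+".
        for i in x:
            for j in y:
                self.mem[i, j, z] -= 1

Note the counters here are unguarded: removing a triple that was never stored would wrap the uint8 cells around.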
I posted a neg_triadicmemory.c version that uses -{x,y,z} as delete syntax, and it works cleanly.
Perhaps the library should have delete functionality too? The Python version is easy to modify.
Here’s some output:
./neg_triadicmemory 1000 5
{1 2 3 4 5, 6 7 8 9 10, 11 12 13 14 15}
{6 7 8 9 10, 11 12 13 14 15, 16 17 18 19 20}
{11 12 13 14 15, 16 17 18 19 20, 21 22 23 24 25}
{1 2 3 4 5, 6 7 8 9 10, _}
11 12 13 14 15
-{1 2 3 4 5, 6 7 8 9 10, 11 12 13 14 15}
{1 2 3 4 5, 6 7 8 9 10, _}
{1 2 3 4 5, 6 7 8 9 10, 11 12 13 14 15}
{1 2 3 4 5, 6 7 8 9 10, _}
11 12 13 14 15
-{1 2 3 4 5, 6 7 8 9 10, 11 12 13 14 15}
{1 2 3 4 5, 6 7 8 9 10, _}
{_, 16 17 18 19 20, 21 22 23 24 25}
11 12 13 14 15
{6 7 8 9 10, 11 12 13 14 15, _}
16 17 18 19 20
{1 2 3 4 5, 6 7 8 9 10, _}
quit
Interesting to see that all triples are independent, so a full delete in a stitching context will require per-position deletes.
Yeah, delete is a plus. It helps prevent errors in continuous learning systems by keeping a “safe” amount of records in memory.
I added DiadicMemory.remove(x,y) and TriadicMemory.remove(x,y,z) in sdrsdm.py
It checks for underflow, i.e. it only decrements non-zero memory locations.
Overflow at store() is still unchecked. I suspect normal usage, where overlap between data samples is high or where e.g. an agent repeatedly encounters the same context/state, would need an overflow check too.
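Something along these lines is what I mean (a sketch assuming 8-bit counters and the same per-bit addressing as above, not the exact sdrsdm code):

import numpy as np

COUNTER_MAX = np.iinfo(np.uint8).max  # 255 for uint8 counters (assumed)

def checked_increment(mem, i, j, z):
    # Overflow check: only bump cells that have not reached the ceiling yet.
    cells = mem[i, j, z]
    mem[i, j, z] = np.where(cells < COUNTER_MAX, cells + 1, cells)

def checked_decrement(mem, i, j, z):
    # Underflow check: only decrement non-zero cells.
    cells = mem[i, j, z]
    mem[i, j, z] = np.where(cells > 0, cells - 1, cells)

The extra comparison sits in the innermost write loop, which is exactly where any overhead would show up.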
@DanML can you share/send me the code that coredumps? I have relatively close setups and both *_test.py run fine
Python 3.7 and 3.8
numba 0.55.0 and 0.55.2
numpy 1.19.4 and 1.22.4
diadic_test.py uses a bit under 700MB
machine is an intel 64 bit ubuntu 20.04.4 LTS
The larger run with failure thresholds.
The centre of the 10/1000 shape is somewhat different to the above (10/256) graphs, in that even at 10M records there is still some chance of getting zero errors, but also some chance of complete incoherence (no signal left).
There should be a way of calculating the probabilities - I’ll have a look - unless someone else already knows?
Good idea, but does it add significant performance overhead?
Perhaps provide two calls (rough sketch below):
- Std - which applies the under/overflow checks
- _raw - which doesn’t, and lets the caller take the risk
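Roughly this shape, with hypothetical names (not the current sdrsdm API):

import numpy as np

class ToyDiadicMemory:
    # Toy diadic memory illustrating the std vs _raw split.
    COUNTER_MAX = 255

    def __init__(self, n=100):
        self.mem = np.zeros((n, n), dtype=np.uint8)

    def store(self, x, y):
        # Std call: saturating increment, never overflows the 8-bit counters.
        cells = self.mem[np.ix_(x, y)]
        self.mem[np.ix_(x, y)] = np.where(cells < self.COUNTER_MAX, cells + 1, cells)

    def store_raw(self, x, y):
        # _raw call: plain increment, the caller accepts the wrap-around risk.
        self.mem[np.ix_(x, y)] += 1

That keeps the default safe while leaving the unchecked path available for benchmarking or trusted workloads.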