In a sense, most of those pairs have a 1x2 receptive field, and a few jump across edges of the image.
No, every “black pixel” will have pairs with all other black pixels.
If pixels are vertices, bit pair expansion is (the set of edges of) a fully connected graph.
Oh, wait, I completely missed that part. Now, looking back, I see the P*(P-1)/2 equation.
I thought the bit pairs were being created in a single loop iteration
I was thinking it was kinda similar to an outer product binding, but it’s actually identical to an outer product with zeroed diagonals lol.
Curious that this is exactly how you compute the weights for a Hopfield network, but you are using those weights as an SDR.
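A tiny sketch of that equivalence (the sizes and bit indices here are made up for illustration): the bit pair expansion of an SDR is the outer product of its dense bit vector with itself, with the diagonal zeroed, which has the same form as a one-pattern Hebbian/Hopfield weight update.

```python
from itertools import combinations

import numpy as np

n = 16
active = [2, 5, 9, 14]                    # P = 4 active bits

# Pair list view: P*(P-1)/2 unordered pairs of active bits.
pairs = set(combinations(active, 2))
assert len(pairs) == 4 * 3 // 2

# Matrix view: outer product with zeroed diagonal.
x = np.zeros(n, dtype=int)
x[active] = 1
W = np.outer(x, x)
np.fill_diagonal(W, 0)

# The upper triangle of W holds exactly the pair list.
rows, cols = np.nonzero(np.triu(W))
assert set(zip(rows.tolist(), cols.tolist())) == pairs
```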
It’s “only” a SDR in the case it is used directly as input to Numenta’s SDR Classifier.
In all other applications (associative memory, value map, id indexer) that expanded SDR represents a list of memory addresses where various “stuff” is stored.
PS Even in the case of the SDR Classifier, the bit pair expansion can be considered as an input layer of size N connected to a hidden layer of size N*(N-1)/2, where each node of the hidden layer connects to exactly 2 nodes in the input and computes the AND function on its input node pair.
In this sense it is a very sparse, fixed (untrained) first hidden layer.
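A minimal sketch of that hidden-layer view, with a made-up input size — each hidden node is hard-wired to one pair of input bits, so nothing is trained:

```python
from itertools import combinations

import numpy as np

N = 8
pairs = list(combinations(range(N), 2))   # N*(N-1)/2 = 28 hidden nodes
x = np.array([1, 0, 1, 1, 0, 0, 0, 1])    # input layer of size N, P = 4 active

# Each hidden node connects to exactly 2 input bits and ANDs them;
# the wiring is fixed, not learned.
hidden = np.array([x[i] & x[j] for i, j in pairs])

# Active hidden nodes = pairs of active input bits = P*(P-1)/2.
assert len(pairs) == N * (N - 1) // 2
assert hidden.sum() == 4 * 3 // 2
```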
Right … if C can be computed from A and B, it carries no information and there’s no point in storing it.
But a memory is necessary if random SDRs {A,B,C} do carry non-computable information to be learnt, for example {sky, hascolor, blue}.
I just don’t understand how we are treating SDRs here.
It sounds like you’re speaking as if A and B are bags of features (MNIST digits) but C is a random symbol.
I thought the sparse machinery was mainly used to process random symbols with minimal overlap.
Yes, I would be interested too. My understanding likewise is that SDRs are unique identifiers represented as sparse bit patterns, with the specific features of a degree of noise immunity and the ability to test for multiple SDRs at once using a suitable mask. If so, one would be looking for algorithms (including sequence recognition) built on that data structure, most likely with a high degree of parallelism. If not, then what?
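A toy illustration of those two properties, with made-up sizes and deterministic bit patterns instead of real random SDRs:

```python
N, P = 256, 8
a = frozenset(range(0, 8))            # toy SDRs: P of N bits active
b = frozenset(range(100, 108))
c = frozenset(range(200, 208))

# Noise immunity: flipping one bit of `a` barely changes its overlap
# with the original, while an unrelated SDR overlaps almost not at all.
noisy_a = (a - {0}) | {50}
assert len(noisy_a & a) == P - 1
assert len(c & a) == 0

# Testing multiple SDRs at once: the union of a and b acts as a mask;
# any stored SDR fully contained in the mask counts as present.
mask = a | b
assert a <= mask and b <= mask and not (c <= mask)
```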
There are multiple 2D morphological transformation steps involved to map an MNIST digit to an SDR that can be processed by an associative memory. Those preprocessing steps should be fixed and deterministic, not requiring learning. Bit pair hashing with 2D topological constraints may prove useful, especially if combined with conventional image filters such as edge/gradient detection. As soon as you combine two such low-level SDRs into a higher-level concept, random SDRs will come into play. Several steps up in the hierarchy of associative memories, you’ll have entirely random SDRs representing abstract and stable concepts like “the digit 7”.
I thought the point was that if C is deterministic then the triadic memory learns the identity function and becomes “dead”
But I think that’s an issue caused by the diagonal addresses; if we set them to zero, like you do in the bit pair expansion, there shouldn’t be a problem.
We are talking in parallel here about different problems.
Digit recognition is one of them;
learning sequential patterns of either “symbols” or “data” encoded/represented as SDRs is another.
Sure, somewhere there has to be a transition from sensory data to symbols, and the machinery should be comfortable with both.
Beware that large language models (transformers) do not work with “pure” symbols; they are based on vector embeddings that encode some sort of “resemblance” between symbols.
The hard part is finding a way to encode “natural” sensory data as useful symbols.
If I got it right, that’s what you are trying to do with all those transformations, and your benchmark for how useful those symbols are is the SDR Classifier.
Yes, the encoding is important. I think it is grossly overlooked in ML.
Google is trying to catch up: there is still an active Kaggle contest on building a universal image encoder that doesn’t need further training. It should produce a fixed-size embedding that is useful in every task and, more importantly, consistent, which means there would be little incentive to train a new “visual cortex” for every ML problem.
Another hard part to add to the above is figuring out what sensory data is useful.
Someone gave an extreme example. Imagine Pac-Man, or any game in which the main character is represented not as an animated image but as a single red pixel moving around. What kind of algorithm would figure out on its own that this is the pixel that matters most?
Yes, that’s why Google’s contest has small chances to succeed. Generally, when a company with Google’s budget is shopping for $25k ideas on Kaggle, it means they ran out of them.
No, that was someone’s partial joke on Reddit. The more reasonable question is how we figure out (search?) what matters and what doesn’t in sensory data.
How big does that pixel need to be in order to get noticed?
Movement is a great clue.
Synchronization between the agent’s actions and pixel motion is also very important.
If I wiggle some muscles in a certain rhythm, what parts of the sensory stream oscillate in sync with them?
That’s why I thought content-neutral cycle sensitivity could be very useful.
Anyway, I think I managed to implement a somewhat limited version of the triadic memory.
It uses bits instead of bytes for the synapses, so it needs 8x less RAM, but it can only go in the A ⊗ B => C direction, using a variation of the bit pair addressing.
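A rough sketch of how I read that idea (names, sizes, and the addressing scheme are my own guesses, not the actual implementation): one bit plane per output bit, with 1-bit synapses packed 8 per byte, recallable only in the A ⊗ B => C direction.

```python
import numpy as np

N = 64                        # SDR size (assumed)
P = 4                         # active bits per SDR (assumed)

# One bit plane per output bit c; each plane has one bit per (a, b) address,
# packed 8 addresses per byte -> ~1/8 the RAM of byte-sized synapses.
memory = np.zeros((N, (N * N + 7) // 8), dtype=np.uint8)

def store(a_bits, b_bits, c_bits):
    """Set the synapse bit at address (a, b) for every active output bit c."""
    for a in a_bits:
        for b in b_bits:
            k = a * N + b
            memory[list(c_bits), k >> 3] |= np.uint8(1 << (k & 7))

def recall(a_bits, b_bits, p=P):
    """Sum the stored bit planes over all (a, b) addresses, keep top p bits."""
    votes = np.zeros(N, dtype=np.int64)
    for a in a_bits:
        for b in b_bits:
            k = a * N + b
            votes += (memory[:, k >> 3] >> (k & 7)) & 1
    return set(np.argsort(votes)[-p:].tolist())

store([1, 5, 9, 13], [2, 6, 10, 14], [3, 7, 11, 15])
assert recall([1, 5, 9, 13], [2, 6, 10, 14]) == {3, 7, 11, 15}
```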
Do you guys recommend any tests you have done that I could use to compare performance?