I’m writing my own implementation of a union pooler / temporal pooling algorithm that learns to represent the currently active sequence in a longer-term context. I’m adapting the code from the nupic Python library, specifically this union pooler implementation.
I’m trying to make the union pooler work with existing code from this repo: bitHTM. That repo strips out details like local receptive fields to simplify and speed up the code (I’m also using cupy instead of numpy so it runs on a GPU, but you can replace cp with np and it behaves the same). The spatial pooling/boosting mechanism is implemented as a vectorised matrix operation taken from that repo, so I’m fairly confident the weight update rule itself is correct.
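For context, this is the general shape of the vectorised SP step I mean (a sketch, not the repo’s exact code; the sizes, learning rate, and exponential boost formula are illustrative):

```python
import numpy as np  # swap in `import cupy as np` for the GPU version

n_inputs, n_columns, k_active = 500, 1024, 40
perm = np.random.rand(n_columns, n_inputs)     # synapse permanences
boost = np.ones(n_columns)                     # boost factors
duty_cycle = np.zeros(n_columns)               # running activation rate

def sp_step(x, lr=0.1, perm_threshold=0.5, boost_strength=2.0, alpha=0.01):
    global duty_cycle
    connected = perm >= perm_threshold         # binary connected synapses
    overlap = connected @ x                    # per-column overlap scores
    winners = np.argsort(overlap * boost)[-k_active:]  # global top-k inhibition
    active = np.zeros(n_columns, dtype=bool)
    active[winners] = True
    # Hebbian update: winners reinforce active input bits, punish inactive ones
    perm[winners] += lr * (2 * x.astype(float) - 1)
    np.clip(perm, 0.0, 1.0, out=perm)
    # update duty cycles and recompute boost factors (nupic-style exp boosting)
    duty_cycle = (1 - alpha) * duty_cycle + alpha * active
    boost[:] = np.exp(boost_strength * (k_active / n_columns - duty_cycle))
    return active
```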
Here’s my implementation. I just wanted to ask whether anyone could help me check that I’m implementing it correctly, because it seems to be broken.
So firstly, I created a batch of synthetic data to test the model with various parameters. The data is 10 patterns, each a sequence of 10 time steps, where each step is a randomly initialised binary array of length 500. I pick a pattern at random, train the model on its 10 inputs one by one, then pick again (sketched below). One worry: these inputs don’t preserve any semantic information between bits; is that crucial for the union pooler to work? The SP/TM learns the patterns essentially perfectly, predicting every column after enough iterations, and predicting the transitions between patterns very badly (usually 0, or fewer than 10, correct columns). That part is expected: when pattern 1 ends, the model can’t know whether pattern 1, 2, 3, 4, etc. comes next.
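For reference, here’s roughly how I generate the data and run training (a sketch; the fixed sparsity, seed, and `model.step` API are illustrative, not my exact code):

```python
import numpy as np

n_patterns, seq_len, input_size, n_active = 10, 10, 500, 25
rng = np.random.default_rng(0)

def make_step():
    x = np.zeros(input_size, dtype=bool)
    x[rng.choice(input_size, n_active, replace=False)] = True
    return x

# patterns[p][t] is the binary input at step t of pattern p
patterns = [[make_step() for _ in range(seq_len)] for _ in range(n_patterns)]

# training loop: pick a random pattern, feed its 10 steps in order
for epoch in range(1000):
    p = rng.integers(n_patterns)
    for x in patterns[p]:
        pass  # model.step(x, learn=True)  # hypothetical model API
```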
The problem is that my model eventually converges to a single stable representation that never changes between patterns. Ideally, the union SDR should represent the currently active pattern: as soon as a new pattern starts it should change rapidly, then stay mostly fixed for the rest of that pattern. In practice, when I measure the change in the union SDR between steps, I don’t see this behaviour at all; it changes almost randomly over the course of a pattern, sometimes a little at the start, sometimes a lot midway through. Oddly, when I plot the sum of the union SDR over each pattern type, the sums are fairly distinct: some bits have high activation for one specific pattern and are zero everywhere else.
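This is the kind of per-step diagnostic I’m using (a sketch): Jaccard overlap between consecutive union SDRs. A healthy pooler should show low overlap at pattern boundaries and high overlap within a pattern.

```python
import numpy as np

def jaccard(a, b):
    """Overlap between two boolean SDRs of the same length."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

# union_sdrs: list of boolean arrays, one per time step, from a run;
# stability[t] ~ 1 means the representation barely moved at step t
# stability = [jaccard(union_sdrs[t], union_sdrs[t + 1])
#              for t in range(len(union_sdrs) - 1)]
```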
I understand this probably isn’t the most rigorous analysis, but is there anything specific about the kinds of patterns it can match, or any issue with my implementation, that might be at fault? Is there a more up-to-date model or pseudocode I could use instead? I’ve tried a number of different learning rules, and removing each existing rule in turn, but nothing forms representations that are both stable and distinct for each of the 10 patterns.
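For what it’s worth, the core pooling rule in the nupic union pooler, as I understand it, is roughly the following (a condensed sketch from the research code, so treat the exact constants and weighting as assumptions): a pooling-activation vector that decays each step, gets strong extra excitation from correctly-predicted input, and a union SDR taken as the top ~20% of that persistent activation.

```python
import numpy as np

n_columns = 1024
union_size = int(0.2 * n_columns)   # maxUnionActivity ~ 20% in nupic
pooling_activation = np.zeros(n_columns)

def union_pool_step(overlap_active, overlap_predicted, decay=0.9,
                    predicted_weight=10.0):
    """overlap_*: per-column feed-forward overlap scores (vectors)."""
    global pooling_activation
    pooling_activation *= decay                   # exponential decay
    pooling_activation += overlap_active          # weak drive: any active input
    pooling_activation += predicted_weight * overlap_predicted  # strong drive
    # union SDR = columns with the highest persistent activation
    winners = np.argsort(pooling_activation)[-union_size:]
    union_sdr = np.zeros(n_columns, dtype=bool)
    union_sdr[winners] = True
    return union_sdr
```

If that’s right, the union SDR should only stay stable within a pattern while the TM keeps predicting correctly, since correctly-predicted input contributes most of the persistent excitation; that’s why I expected rapid change at pattern boundaries and little change within a pattern.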
I would be extremely grateful for any ideas about what could be wrong. Thank you!