SDR classification

Neat! I tried to do this same thing using htm.core, but I never got it to work…

2 Likes

@dmac Yeah, the problem was that the HTM spatial pooler had too much stuff going on with it. It was designed to be a neuro-simulator, not an ML algorithm.

While trying to create my own version, I stripped out things I didn’t understand or couldn’t make work reliably. I then had my own lean, simplified version, which we’ll call a Cortical Pooler (CP). It did a great job of learning a mapping from one set of SDRs to another, with the same “like inputs create like outputs” property that the HTM SP has. However, it didn’t have the lesion-resilience properties or the spatial topology.

From that starting point, it was quite straightforward to convert it from an unsupervised learner to a supervised one. The only challenge was deciding what the reward-punish learning protocol would be. Modifying synaptic permanence only when a neuron activates, and not when it merely should have activated, was another breakthrough that enabled learning without catastrophic forgetting. Since only a fraction of the neurons activate at a time, only a fraction of the neurons have their permanences changed, preventing any previously learned patterns from being forgotten.
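For concreteness, here is a minimal sketch of that activation-gated update rule, assuming dense NumPy permanence matrices. The function and parameter names are mine, not taken from the actual CP code:

```python
import numpy as np

def supervised_update(perms, input_sdr, active, target,
                      inc=0.05, dec=0.02):
    """Reward-punish permanence update, gated on actual activation.

    perms:     (n_neurons, n_inputs) permanence matrix
    input_sdr: (n_inputs,) binary input vector
    active:    indices of neurons that actually fired
    target:    indices of neurons that *should* fire for this label

    Key point: only neurons that activated are touched. Inactive
    neurons -- and the patterns they encode -- are left alone, which
    is what avoids catastrophic forgetting.
    """
    target = set(target)
    for i in active:
        if i in target:
            # Reward: grow synapses to active inputs, shrink the rest.
            perms[i] += np.where(input_sdr == 1, inc, -dec)
        else:
            # Punish: a wrongly active neuron unlearns this input.
            perms[i] -= np.where(input_sdr == 1, inc, 0.0)
    np.clip(perms, 0.0, 1.0, out=perms)
    return perms
```

Note that neurons in `target` that failed to activate are deliberately not rewarded; the correct responses are expected to emerge as the wrong ones are unlearned.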

I’m honestly surprised people have not been using this more widely. I showed it off many years ago, but no one seemed to pick up on it.

1 Like

htm.core inherited Numenta’s topology research, which wasn’t very practical. It was focused on scientific properties (like lesion resistance) rather than ML performance.

In my experience playing with HTM’s, the best and most practical way to implement topology is to stitch together many instances of the local/non-topological model into a regular grid.
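A rough sketch of that stitching idea: cut the 2-D input into a regular grid of tiles and run one independent, non-topological pooler per tile. The `pool_fns` plumbing here is illustrative, not from any particular library:

```python
import numpy as np

def tiled_pool(image, pool_fns, tile=8):
    """Crude topology by tiling: each tile of a 2-D binary input is fed
    to its own dedicated pooler, so every pooler has a local receptive
    field without any topology logic inside the pooler itself.

    pool_fns: one callable per tile, each mapping a flattened patch
              to that tile's output SDR.
    """
    h, w = image.shape
    outputs, i = [], 0
    for r in range(0, h, tile):
        for c in range(0, w, tile):
            patch = image[r:r+tile, c:c+tile].ravel()
            outputs.append(pool_fns[i](patch))  # dedicated pooler per tile
            i += 1
    return np.concatenate(outputs)
```

In practice each entry of `pool_fns` would wrap a separate instance of the local model (e.g. one spatial pooler per tile), and the concatenated outputs form the region's SDR.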

1 Like

Brilliant. Being naive in my experiments, I think I tried to force cells to activate, fearing that without that kind of supervision they would never activate correctly. But now that you say it works, it seems plausible that by simply unlearning the initial responses, the correct responses would eventually emerge.

1 Like

@dmac I think another issue with htm.core is its low computational speed, which discourages users from building a big network with many hierarchical levels!

1 Like

@jacobeverist thanks for sharing your interesting work. Do you have any experiments with very long sequences?

1 Like

@thanh-binh.to sorry it took so long to respond. I ended up having to do a lot of refactoring and updating my websites so that I can share my simulations easily in the future.

Here is an example of the Cortical Classifier setup for 4 states, 8 states, and 16 states respectively:

The biggest difference between them is that it takes longer to learn for the 16-state classifier. You can speed up the learning by setting the sim speed to max and then you will see it eventually converge.

An interesting experiment I could try is to see whether the training time scales linearly with the number of states, or whether having more states to learn has a nonlinear effect. I suspect the latter, but I don’t know what its magnitude might be.

Other things to try would be varying the total cells assigned to represent a state, as well as varying the total number of cells that activate.

For instance, we could have a classifier set up for m = 2 states. Each state is assigned s = 8 cells, so the total number of cells in the classifier is n = 16. We can then set our total number of activations to a = {4, 8, 12}. That is, 3 different possibilities.

If a=4, then we are under-activating, only setting 50% of the cell bits assigned to a state.
01101010 00000000

If a=8, then we are fully-activating, setting 100% of the cell bits assigned to a state.
11111111 00000000

If a=12, then we are over-activating, setting 150% of the cell bits assigned to a state, inadvertently activating cells assigned to other states.
11111111 00100111

It would be interesting to see what the consequences of this will be. My intuition is that under-activation would be preferred because it would make the state classifiers more robust, preventing the cells from over-fitting, and enabling the classifier to detect a wide variety of different features that would be assigned to that class state.
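To make the three setups concrete, here is a small sketch (my own illustrative code, not from the Cortical Classifier) that builds the a = 4, 8, and 12 activation patterns and shows how far each spills outside a state’s assigned cell block:

```python
import numpy as np

# m = 2 states, s = 8 cells per state, n = m * s = 16 cells total.
m, s = 2, 8
n = m * s

def activate(state, a, rng):
    """Return an n-bit activation with `a` active cells, preferring the
    cells assigned to `state` and spilling into other states' cells
    only once that block is exhausted (the over-activation case)."""
    own = np.arange(state * s, (state + 1) * s)
    others = np.setdiff1d(np.arange(n), own)
    picked = list(rng.choice(own, min(a, s), replace=False))
    if a > s:
        # Over-activation: the extra cells land in other states' blocks.
        picked += list(rng.choice(others, a - s, replace=False))
    bits = np.zeros(n, dtype=int)
    bits[picked] = 1
    return bits

rng = np.random.default_rng(0)
for a in (4, 8, 12):
    bits = activate(0, a, rng)
    print(a, bits, f"{bits[:s].sum()}/{s} of state-0 cells active")
```

Running this for state 0 reproduces the three regimes above: 4/8, 8/8, and 8/8 of the assigned block active, with the a=12 case additionally lighting up 4 cells belonging to the other state.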

1 Like

Hi everyone. So I’ve been quietly playing around with this problem for a few days.

I decided to change my approach. Rather than identify entire words, I’m identifying key intent: did the user mean to swipe that key, or were they just passing through?

The intuition here is that your swipe changes when you reach a key: you change direction or speed and move on to the next key. Few keys in most words lie on a straight line to the others; they’re deliberately scattered all over the keyboard.
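That intuition can be prototyped without any HTM machinery at all. Here is a hedged sketch that flags a sample point as a deliberate key when the turning angle or a speed drop exceeds a threshold; the threshold values and the uniform-sample-timing assumption are mine:

```python
import math

def register_points(path, angle_thresh=45.0, slow_factor=0.5):
    """Flag points on a swipe path where the user likely meant a key.

    path: ordered list of (x, y) samples. A point is flagged when the
    turning angle there exceeds `angle_thresh` degrees, or the local
    speed drops below `slow_factor` times the previous segment's speed
    (assuming uniform sample timing). Thresholds are illustrative.
    """
    flags = []
    for i in range(1, len(path) - 1):
        ax, ay = path[i][0] - path[i-1][0], path[i][1] - path[i-1][1]
        bx, by = path[i+1][0] - path[i][0], path[i+1][1] - path[i][1]
        v_in, v_out = math.hypot(ax, ay), math.hypot(bx, by)
        if v_in == 0 or v_out == 0:
            flags.append(True)  # a full stop is a strong "key" signal
            continue
        cos_t = max(-1.0, min(1.0, (ax*bx + ay*by) / (v_in * v_out)))
        turn = math.degrees(math.acos(cos_t))
        flags.append(turn > angle_thresh or v_out < slow_factor * v_in)
    return flags
```

A straight pass over a key yields no flag, while a sharp corner or a slowdown at the key does; the flagged points are the candidate “register key” events for the classifier.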

So now I’ve changed to a binary classifier: a simple “register this key” or not.

With that approach and a simple SP, I’m able to get a 66% accuracy rate on keys.

So now my problem becomes how to optimize the SP parameters to improve this rate. Here are my parameters:

sp_params = {
    "potentialRadius": 7,
    "boostStrength": 7.0,
    "columnDimensions": (79, 79),
    "dutyCyclePeriod": 1402,
    "localAreaDensity": 0.1,
    "minPctOverlapDutyCycle": 0.2,
    "potentialPct": 0.1,
    "stimulusThreshold": 6,
    "synPermActiveInc": 0.14,
    "synPermConnected": 0.5,
    "synPermInactiveDec": 0.02,
}
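Before sweeping parameters, it might help to sanity-check the quantities those numbers imply. A quick back-of-the-envelope sketch in plain Python, with values copied from the list above and assuming a standard 2-D local potential pool of (2r+1)² inputs:

```python
# Quantities implied by the SP parameters above.
cols_x = cols_y = 79        # columnDimensions
local_area_density = 0.1    # localAreaDensity
potential_radius = 7        # potentialRadius
potential_pct = 0.1         # potentialPct
stimulus_threshold = 6      # stimulusThreshold

n_columns = cols_x * cols_y
active_columns = int(n_columns * local_area_density)

# Potential pool per column: a (2r+1)^2 input neighborhood,
# thinned down by potentialPct.
pool = int(((2 * potential_radius + 1) ** 2) * potential_pct)

print(f"{n_columns} columns, ~{active_columns} active per step")
print(f"~{pool} potential synapses per column, "
      f"against a stimulusThreshold of {stimulus_threshold}")
```

If that arithmetic holds, each column has only about 22 potential synapses yet needs 6 of them connected *and* active just to clear `stimulusThreshold`, which is a tight margin on sparse binary images. That makes `potentialPct` and `stimulusThreshold` look like the obvious first knobs to try.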

I’m working on a classifier for binary images (say 100x70); are there any obvious parameters to play with?
1 Like