@cezar_t @dmac @spitfire
While we’re on the subject of SDR classification, I want to reintroduce the classifier algorithm I developed several years ago that seems to be more performant than the standard linear classifier used in the HTM community. For lack of a better term, I will call it a Cortical Classifier, since it looks like a stripped-down Spatial Pooler augmented for supervised learning.
Types of Classification
Before I start, I want to clarify what I think are two different types of classification tasks. The first is static classification: receiving a single SDR code and assigning it a label. The second is dynamic classification, which is more of a sensorimotor process involving a series of inputs and a hierarchical cortical process that eliminates possibilities and predicts likely future inputs based on a classified model. The latter is what I think this keyboard-stroke glyph classification problem is, and it highlights the kind of problem that Discrete Cortical Circuits (DCC) inspired by the brain are likely to solve. However, the Cortical Classifier I present here is just a static classifier, which may be of some use in the future when solving dynamic classification.
Demo
To start, you can see the Cortical Classifier in action using my DCC dashboard:
If my domain is white-listed, you should be able to see the demo embedded below. Otherwise, just use the above link.
The demo shows three blocks: a discrete data source that loops through a sequence of integer values, a discrete encoder that transforms discrete values into an SDR encoding, and the third is the Cortical Classifier that takes in the SDR input from the encoder and the discrete training signal that specifies the proper label for each SDR input.
The block for the Cortical Classifier has two activity plots. The top shows the probabilistic interpretation of which label the input should be classified as, and the bottom shows the cell activations that vote on which class label the current input should be assigned to.
How It Works
The classifier cells are just normal HTM-like cells with binary-weighted synapses, each carrying a permanence value, applied to the input field. The classifier is parameterized by n, the number of cells, and m, the number of labels. For each class, l cells are assigned to detecting that class. Ideally, these cells specialize to detect different aspects of a class if that class has multiple possible inputs (this is not the case for the demo). You can interpret the cell activations as a probability vector over the classes by, for each class, dividing the number of that class's cells that activated by k, the total number of active cells.
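The probability readout above can be sketched in a few lines. This is a minimal illustration under my own assumptions about the data layout (cells grouped contiguously by class); the function name is hypothetical, not from the BrainBlocks or DCC code:

```python
import numpy as np

def class_probabilities(active, cells_per_class, num_classes, k):
    """Interpret k-WTA cell activations as a probability vector over classes.

    active: boolean array of length num_classes * cells_per_class, with
    exactly k True entries (the k-WTA winners), cells grouped by class.
    """
    # Count how many of each class's cells are among the k winners...
    counts = active.reshape(num_classes, cells_per_class).sum(axis=1)
    # ...and divide by k; the counts sum to k, so the result sums to 1.
    return counts / k
```

For example, with 3 classes of 4 cells each and k = 4 winners, if 3 winners fall in class 0 and 1 in class 1, the output is `[0.75, 0.25, 0.0]`.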
I said earlier that the Cortical Classifier (CC) is like a stripped-down HTM Spatial Pooler. You can imagine the CC as a spatial pooler without the topology, without the neuron boosting, and without the ability for neurons to activate of their own volition. Instead, the cells are activated with a k-Winner-Take-All algorithm applied over the whole field, which guarantees a fixed k-sparsity.
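The k-WTA activation step can be sketched like this. It's a simplified illustration, not the actual DCC code: I assume a dense binary connectivity matrix, and the function name is my own:

```python
import numpy as np

def kwta_activate(input_sdr, connected, k):
    """k-Winner-Take-All over the whole cell field (no topology).

    input_sdr: binary input vector (1 = active bit)
    connected: (num_cells, input_size) binary matrix of connected synapses
    Returns a boolean activation vector with exactly k True entries.
    """
    overlaps = connected @ input_sdr        # overlap score for each cell
    winners = np.argsort(overlaps)[-k:]     # indices of the k best-matching cells
    active = np.zeros(connected.shape[0], dtype=bool)
    active[winners] = True
    return active
```

Note that ties at the k-th overlap are broken arbitrarily here; a real implementation might break them randomly or by a boosting-like heuristic.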
The use of k-WTA seems to be the natural mechanism for activating a layer of neurons, since it efficiently mimics the effect of lateral inhibition. However, since we don’t use topology (which would inhibit cell neighbors), we can easily visualize what is happening by clustering similar cells next to each other purely for visualization. You can see that effect in the demo as the classifier fully learns to classify around step 60.
The other major aspect of the CC is the learning mechanism. Unlike the HTM Spatial Pooler, which is an unsupervised learning algorithm, the CC is a supervised learning algorithm with a training signal that tells the neurons whether their activation was appropriate. When a neuron activates, it either reinforces its synaptic permanences if the activation was correct or punishes them if the activation was incorrect. It does not, however, modify neurons that fail to activate, even when they should have. In my experiments, this was the quickest and most stable way to train the classifier across a wide variety of circumstances. Since this was based on experimentation and intuition rather than a formal analysis, there may be variations that are more appropriate in different situations.
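A sketch of that supervised update might look like the following. The increment/decrement magnitudes, the dense permanence layout, and the rule of only touching active input bits are all my assumptions for illustration, not the author's exact rule:

```python
import numpy as np

PERM_INC, PERM_DEC, PERM_MAX = 1, 1, 99  # assumed update magnitudes and ceiling

def learn_step(active, cell_labels, target_label, perms, input_sdr):
    """Supervised permanence update: only cells that activated are modified.

    active:      boolean activation vector from k-WTA
    cell_labels: class label each cell is assigned to detect
    perms:       (num_cells, input_size) integer permanences (dense for simplicity)
    """
    for c in np.flatnonzero(active):
        # Correctly active cells reinforce; incorrectly active cells are punished.
        delta = PERM_INC if cell_labels[c] == target_label else -PERM_DEC
        perms[c, input_sdr == 1] += delta   # only synapses on active input bits
    # Inactive cells are left untouched, even if they "should" have fired.
    np.clip(perms, 0, PERM_MAX, out=perms)
```
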
One final feature that does not exist in HTM is the ability to prune and grow synapses. Each cell has a finite number of synapses, a fraction of the total input field size. Each synapse has an integer permanence value. If the permanence is above some positive threshold, the synapse is connected and passes its input value to its neuron for summation. Unique to CCs, however, if the permanence drops to zero, the synapse is pruned (deleted) and a new synapse is grown (created) at a randomly chosen input location. This synaptic prune-grow strategy allows the algorithm to efficiently allocate resources and enables neurons to adapt and specialize to the structure of the input they receive.
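The prune-grow step for a single cell could be sketched as below. The address-array representation, starting permanence, and function name are assumptions on my part (the real version lives in BrainBlocks' BlockMemory):

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded here only for reproducibility

def prune_and_grow(perms, addrs, input_size):
    """Replace dead synapses for one cell.

    perms: (num_synapses,) integer permanences
    addrs: (num_synapses,) input index each synapse connects to
    When a permanence hits zero, the synapse is pruned and regrown at a
    random input location with a small starting permanence.
    """
    GROW_PERM = 1                              # assumed initial permanence
    dead = np.flatnonzero(perms == 0)          # synapses whose permanence decayed away
    addrs[dead] = rng.integers(0, input_size, size=dead.size)  # new random targets
    perms[dead] = GROW_PERM
    return perms, addrs
```

Because a cell's synapse count is fixed, pruning and regrowing keeps the resource budget constant while letting synapses migrate toward informative input bits.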
You can review an older implementation in my BrainBlocks library from 4 years ago.
In particular, you can view the cell activation step here and the learning step here.
The synaptic prune-grow code is in the BlockMemory class. However, looking at the BrainBlocks code, I see we had disabled the prune-grow capability at the time. You can still see the prune-grow code in the learn_move function of the BlockMemory class. I do use the prune-grow functionality in my latest DCC library, which runs the demo. I haven’t shared that code yet.
I just thought I would share this because I was surprised how well it performed once I made the algorithm HTM-like instead of using a classical ML approach.