Criticism of HTM theory


I got into a discussion of HTM on the Nengo forums, but I don’t know enough to address their points:


  1. All of the inputs are binary vectors. Similarly for connections, if the ‘permanence’ value of a synapse is above a threshold, the connection is assigned a value of 1, otherwise 0. As actual system inputs and neural connections are definitely not binary, this makes it pretty hard to model in a biologically plausible neural network.

  2. In the second step of their spatial pooling (Page 35, Phase 2) they find the k most active columns and apply learning to only these columns. Dynamically, setting up winner-take-all (WTA) with lateral inhibitory connections is notoriously tricky, and isn’t something that can be done in a single time step. On top of that, controlling the learning so that it’s only applied after the network has settled on a set of winners is a whole other issue. It might be the case that it works with learning running the whole time as the WTA circuit settles, but the dynamics are complex and can’t just be assumed to work.
    (Side note: robust WTA can be implemented using subcortical circuits, e.g. the Nengo Basal Ganglia model, which is an implementation of a circuit detailed by experimentalist work from Gurney, Prescott, and Redgrave)

  3. Another potential issue is their distinction of activation due to lateral or feedforward connections. They put a neuron in a ‘predictive state’ if it receives activity over a lateral connection, and an ‘active state’ if it receives activity over feedforward connections. I’m not sure what this translates to in an actual neural model. Maybe lateral connections modulate the STDP from feedforward connections. But it’s another thing that gets glossed over by not ‘getting bogged down in details’ and would need to be sorted out in a more biologically constrained model.
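
To make points 1 and 2 concrete, here is a minimal Python sketch of the two abstractions being criticized, as I understand them from the description above: thresholding scalar permanences into binary connections, and picking the k most active columns in one step. The function names, the 0.5 threshold, and the example values are my own assumptions for illustration, not anything taken from the HTM paper or code.

```python
def connected(permanences, threshold=0.5):
    """Binarize scalar permanences: 1 if above the threshold, else 0."""
    return [1 if p > threshold else 0 for p in permanences]


def overlap(input_bits, permanences, threshold=0.5):
    """Count the active input bits that arrive over connected synapses."""
    return sum(b & c for b, c in zip(input_bits, connected(permanences, threshold)))


def k_winners(overlaps, k):
    """Indices of the k columns with the highest overlap (ties go to the lower index)."""
    ranked = sorted(range(len(overlaps)), key=lambda i: (-overlaps[i], i))
    return sorted(ranked[:k])


# Hypothetical example: one binary input vector, three columns of permanences.
input_bits = [1, 0, 1, 1]
columns = [
    [0.6, 0.2, 0.7, 0.1],
    [0.4, 0.9, 0.3, 0.8],
    [0.55, 0.6, 0.51, 0.9],
]
overlaps = [overlap(input_bits, perms) for perms in columns]
winners = k_winners(overlaps, k=2)  # learning would be applied to these columns only
```

Note that `k_winners` is a global sort done in a single step. That is exactly the shortcut point 2 objects to: it stands in for whatever inhibitory dynamics would select the winners in a real network, it does not model them.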

Does anyone want to comment (here or there)?


I am certainly not an expert, but I can post some relevant information to a couple of these questions.

Jeff posted some relevant details related to this on another thread:

Wikipedia has an article about depolarization. The idea is that there is a large impact on the neuron, but it doesn’t actually fire. I am not a neuroscientist, though, so someone else may have a better explanation or resource than that.

To pull out what I see as some of the more relevant details for understanding the concept of “predictive state”:

Once the stimuli have reached the cell body, the nerve must integrate the various stimuli before the nerve can respond. The stimuli that have traveled down the dendrites converge at the axon hillock, where they are summed to determine the neuronal response. If the sum of the stimuli reaches a certain voltage, known as the threshold potential, depolarization continues from the axon hillock down the axon.

This first point indicates that depolarization of the cell must reach some threshold level before the cell transmits an action potential. Depolarization can happen without the neuron firing.

Next, from the linked article on Action Potential:

neurotransmitters then bind to receptors on the postsynaptic cell. This binding opens various types of ion channels. This opening has the further effect of changing the local permeability of the cell membrane and, thus, the membrane potential. If the binding increases the voltage (depolarizes the membrane), the synapse is excitatory. If, however, the binding decreases the voltage (hyperpolarizes the membrane), it is inhibitory.

Once you see that hyperpolarization is inhibitory and depolarization is excitatory, and that depolarization is not a boolean event, I think it is easy to picture a state in which a cell has been depolarized but remains below the threshold. In this state it is “primed” to fire. This is what is referred to as the “predictive state”.
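
As a rough illustration of that subthreshold idea, here is a toy Python sketch. It is not a neuron model and not HTM code; the function name, the state labels, and the numbers (a threshold of 1.0, inputs of 0.5 and 0.75) are all made up for the example. Positive inputs depolarize, negative inputs hyperpolarize, and the cell only fires once the sum reaches the threshold.

```python
def membrane_state(inputs, threshold=1.0):
    """Classify a cell from the sum of its inputs (hypothetical toy model).

    Positive values depolarize (excitatory), negative values hyperpolarize
    (inhibitory). The units are arbitrary.
    """
    total = sum(inputs)
    if total >= threshold:
        return "firing"
    if total > 0:
        return "primed"  # depolarized, but still below threshold
    return "resting"


lateral = 0.5       # assumed contribution from lateral connections
feedforward = 0.75  # assumed contribution from feedforward connections

print(membrane_state([lateral]))               # primed: depolarized without firing
print(membrane_state([lateral, feedforward]))  # firing: the same feedforward input now crosses threshold
print(membrane_state([feedforward, -0.5]))     # primed: inhibition keeps the cell below threshold
```

The point of the second call is that a cell which was already primed reaches threshold with feedforward input that would not have been enough on its own.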

  1. Non-binary weights may be helpful but not essential. Since neurons learn patterns and fine-tune their synapses to fit that pattern, if the binary weight approximation has an impact, they’ll just grow or lose synapses, or represent another equally valid pattern.
    HTM is still biologically plausible, since binary weight and scalar weight are both consistent with the same theory. You could add in a bunch of biological features and there’s no reason the same theories wouldn’t be applicable.

  2. There are so many ways that the spatial pooler (SP) could work in biology that it’s a good guess. There is evidence for sparse firing and for neurons preferring particular patterns, and that’s all it needs.

  3. Again, there are lots of ways to do sequence memory, since it just requires activity in the previous moment to bias activity in the next moment.
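
To show how little point 3 strictly requires, here is a deliberately simple Python sketch in which the previous element biases which elements are expected next. This is a first-order toy of my own, with a hypothetical class name and made-up sequences. It is not HTM's temporal memory algorithm and makes no attempt at context-dependent predictions.

```python
from collections import defaultdict


class FirstOrderMemory:
    """Hypothetical toy: remembers which element has followed which."""

    def __init__(self):
        self.followers = defaultdict(set)

    def learn(self, sequence):
        # Record every (previous, next) pair seen in the sequence.
        for prev, nxt in zip(sequence, sequence[1:]):
            self.followers[prev].add(nxt)

    def biased(self, prev):
        # Elements that activity at the previous moment biases toward.
        # .get avoids creating empty entries for unseen elements.
        return set(self.followers.get(prev, set()))


memory = FirstOrderMemory()
memory.learn("ABCD")
memory.learn("XBCY")

print(memory.biased("A"))  # {'B'}
print(memory.biased("C"))  # contains both 'D' and 'Y'
```

After "C" the toy expects both "D" and "Y", because it only looks one step back and cannot tell which sequence it is in.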

Not getting bogged down in details is the only option to create a comprehensive hypothesis. There are way too many details to worry about those which likely have small impacts and unknown function. Neuroscience is mostly a failure when it comes to theories, and it’s easy to make passing interpretations of some specific findings, but very hard and inefficient to conclude the precise role of a detail, much less how those roles combine. There are too many factors and inconsistencies to make anything but an educated guess. To reach a comprehensive theory of intelligence, an evolving tentative theory is required.