I’ve always had this question in the back of my mind: “What makes HTM different from, and in what ways superior to, classical NNs or deep NNs as a way of learning?”
I thought about it today and wanted to check my intuition against the group here at HTM forum.
I thought, well, I don’t know much about backpropagation, but it seems like during backpropagation weights can be modified in a way that destabilizes the net. HTM doesn’t seem to be susceptible to this problem.
Key Pixel Problem
It’s called the ‘key pixel problem’ and it affects Generative Adversarial Networks in a big way. As far as I understand it, it’s one of the reasons their training is harder to get to converge. Anyway, if you haven’t heard of this problem, let me try to describe it in my amateur way:
Neural networks are prone to unintentional fooling. This is currently not seen as an inherent flaw in the design principles of deep neural nets, but rather as an inevitable outcome of any dynamically chaotic system. So the fixes get applied after training; current solutions are not changes to the algorithms themselves. The symptoms, as I see them (I’ll put a toy sketch right after this list):
- key pixels: the net overfits on a few pixels that, by chance, happen to follow a simple rule across all the training examples
- opacity: it’s hard to see what the net is actually attending to (attention problems)
- small effects can be disproportionately weighted
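To make that last point concrete, here’s a toy sketch (entirely made up by me, not from any paper): a tiny linear classifier whose training happened to hang an enormous weight on one pixel. Changing just that one pixel flips the prediction, which is the kind of failure I mean by ‘key pixels’.

```python
import numpy as np

n_pixels = 64
# Small positive weights everywhere except one accidental "key pixel"
# that training happened to give a huge negative weight.
w = np.full(n_pixels, 0.02)
w[13] = -5.0

def classify(image):
    """Toy binary classifier: 1 if the weighted sum is positive."""
    return int(image @ w > 0)

image = np.full(n_pixels, 0.5)   # a bland, uniform "image"
image[13] = 0.0                  # the key pixel happens to be dark

print("original prediction:", classify(image))          # -> 1
perturbed = image.copy()
perturbed[13] = 1.0              # change only that single pixel
print("after one-pixel change:", classify(perturbed))   # -> 0
```

Real fooling examples are obviously subtler than this, but the disproportionate weighting is the same basic idea.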
Obviously, I’m of the opinion that the key pixel problem is inherent in the algorithm itself. It’s something to be fixed, not something to be defended against after the fact. Of course, it’s better to prevent problems by design than to patch them once they appear.
HTM and the Key Pixel Problem
Anyway, I began to wonder: how does HTM solve this (assuming it does, since we humans, at least at the lowest layers of perception, don’t seem to be susceptible to this problem)?
Perhaps someone here can give better intuition than my own, as I don’t know DNNs or HTM as well as you guys. But my initial intuition says HTM isn’t susceptible to this because of the layer one feedback mechanism. That is, the layers above send down a representation of the current higher-level context: they give the lower levels a union of the lower-level perceptions they can expect to see next.
My feeling is that this union of patterns (which is really a union of sparse unions, given that each lower layer receives data from multiple neighboring layers above it) associates those patterns together in the same context. So it seems to me it acts as a constraining mechanism, one that has no analogue in DNN backpropagation.
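Here’s a crude numpy sketch of that intuition (my own toy code, not NuPIC or any actual HTM implementation; names like `feedback_union` are just mine): the higher regions send down a union of the sparse patterns they expect next, and the lower region treats that union as a bias on which columns get to win.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cols, sparsity = 2048, 40          # typical-ish SDR size / active count

def random_sdr():
    """A random sparse binary pattern over the columns."""
    sdr = np.zeros(n_cols, dtype=bool)
    sdr[rng.choice(n_cols, sparsity, replace=False)] = True
    return sdr

# Feedback: union of several patterns the higher layers consider plausible next.
feedback_union = random_sdr() | random_sdr() | random_sdr()

# Feedforward: raw overlap scores from the current input (made up here).
overlap = rng.random(n_cols)

# Bias the competition: predicted columns get a boost before the top-k
# selection, so ambiguous evidence resolves toward the higher-level context.
boosted = overlap + 0.5 * feedback_union
winners = np.argsort(boosted)[-sparsity:]

print("winners inside the predicted union:",
      int(feedback_union[winners].sum()), "of", sparsity)
```

The point is just that ambiguous bottom-up evidence gets resolved toward whatever the higher-level context already expects.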
In other words, it seems like DNNs need another component in their backpropagation, one that uses the patterns (information) contained in higher-level concepts to constrain changes to a fuzzy set of nodes in the lower layers.
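Just to make that speculation concrete (this is not an existing technique as far as I know, purely an illustration with made-up names like `context_mask`): imagine a fuzzy mask derived from higher-level activity scaling the gradient updates of a lower layer, so weights unrelated to the current context barely move.

```python
import numpy as np

rng = np.random.default_rng(2)
n_hidden, n_inputs = 32, 16

W = rng.normal(0.0, 0.1, (n_hidden, n_inputs))        # lower-layer weights
grad_W = rng.normal(0.0, 0.1, (n_hidden, n_inputs))   # backprop gradient (made up)

# Hypothetical fuzzy mask: 1.0 for hidden units the higher-level context says
# are relevant, a small value for everything else.
context_mask = np.where(rng.random(n_hidden) < 0.25, 1.0, 0.05)

learning_rate = 0.01
delta = learning_rate * context_mask[:, None] * grad_W
W -= delta                                             # constrained update

relevant = context_mask == 1.0
print("mean |update|, context-relevant units:", float(np.abs(delta[relevant]).mean()))
print("mean |update|, unrelated units:       ", float(np.abs(delta[~relevant]).mean()))
```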
Anyway, I know this might be confusing; I’m speaking at the edge of, and surely beyond, my understanding of both HTM theory and DNN theory. So I might be radically misinformed, or what I’m saying might be pathetically obvious to everyone else, I don’t know.
So I’d like some feedback: is this right? Is my intuition at least conceptually correct? I can’t think in math, but I can think in cause and effect, in algorithms. What do you think?