HTM vs Neural Networks for Building Intelligent Machines

Hello Fellows

I wanted to understand HTM better, so I searched for information comparing HTM and neural networks (deep learning) for building artificial intelligence. Which one is better, and why? Listed below is what I ended up with. What is wrong here? What is missing? What else should I think about?

Pros

  • HTM is a more accurate imitation of the only system proven to produce intelligence: the brain.

  • An HTM system should be simpler. HTM itself is complex, but in theory one technique is enough for all the services the neocortex provides. Reusability makes it simple and scalable. The only manageable way to build an intelligent machine with neural networks is to mimic the high-level architecture of the brain: you create a few basic building structures and compose your intelligent machine from copies of them. You cannot have highly customized and optimized neural network solutions (word2vec, seq2seq, convolution, Inception, …) for every task you can think of; the complexity would explode.

  • HTM is naturally very noise tolerant. I have not yet looked into how a neural network can be made noise tolerant. The key to HTM's noise tolerance is sparse distributed representations (SDRs): you can lose most of the data and add some noise, and there are still enough active bits in the right places to identify the correct object (see the SDR sketch after this list).

  • An HTM system could be fault tolerant. Building human-level intelligence with current technologies will inevitably lead you to build a huge system with many services. As anyone working with services knows, there will be all sorts of problems: the more services you have, the more problems you have. In a big enough system there may be a broken service at any given moment. There are ways to work around this, but in my experience, the less you need to work around, the simpler your solution will be, and simplicity converts to power and success in software development. HTM is very noise tolerant, as mentioned above; perhaps your HTM services would not need any error handling at all for the case where a connected HTM service is broken, since a broken service just converts to extra noise.

  • Temporality is an important component of recognizing real-world patterns, especially predicting what will happen next. HTM is about learning temporal patterns, and it is naturally time-variant since it is stateful and context-dependent. A plain neural network is time-invariant: the same input produces the same output every time, so the input itself must carry the time variation. There are stateful and context-dependent neural network solutions: an LSTM is time-variant because of its hidden state, but it uses a time-invariant technique to find patterns in a sequence, where the sequence simulates discrete time steps. LSTM is also the technique under the hood of more advanced time-variant models such as the Neural Turing Machine or Neural Programmer-Interpreters. The Neural Turing Machine's memory is genuinely stateful, since memory is always stateful: the write operation's weights and erase vector change the memory state even if you save the same thing to memory over and over again. This makes it very stateful and far from purely functional. Neural Programmer-Interpreters have a call stack that is truly stateful and context-dependent. I believe being stateful is important for finding temporal patterns. As a software developer, I know statefulness makes code harder to understand, so I dislike state. When I balance simplicity against temporality, I tend to dislike the neural network solutions more and HTM less. (The stateless-vs-stateful sketch after this list illustrates the distinction I mean.)

  • When you go higher in the hierarchy of neocortex regions, the output of each region becomes more stable. HTM has a similar feature, as described in Numenta's white paper. The same should be doable with a neural network, based on Schmidhuber's 1992 Neural Computation paper. After reading the abstract of that study, it feels like Schmidhuber's stabilizing features are much more complex in nature than HTM's.

  • It is natural for HTM to update its model online, continuously, without batches. I am not saying this is a requirement for intelligence, but it would make the machine more agile, and being agile sounds intelligent to me (see the online-learning sketch after this list).

  • HTM learns unsupervised.

  • I find HTM more interesting, so I am biased.
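
To make the noise-tolerance bullet concrete, here is a minimal sketch in plain NumPy. It is not NuPIC or any real HTM code; the vector size, sparsity, noise levels, and match threshold are numbers I made up for illustration. It only shows why comparing sparse binary vectors by overlap tolerates losing bits and adding noise.

```python
import numpy as np

rng = np.random.default_rng(42)

# A toy SDR: 2048 bits with ~2% of them active (assumed, illustrative numbers).
N, ACTIVE = 2048, 40
stored_idx = rng.choice(N, size=ACTIVE, replace=False)   # active bits of a learned pattern
stored = np.zeros(N, dtype=bool)
stored[stored_idx] = True

# Corrupt the input: drop half of the active bits and switch on some random noise bits.
noisy = stored.copy()
noisy[rng.choice(stored_idx, size=ACTIVE // 2, replace=False)] = False   # lose 50% of the data
noisy[rng.choice(N, size=20, replace=False)] = True                      # add noise on top

# Matching is just counting the overlapping active bits against a threshold.
overlap = np.count_nonzero(stored & noisy)
THRESHOLD = 15   # assumed; a random unrelated SDR would overlap by less than one bit on average
print(f"overlap = {overlap}, still recognized = {overlap >= THRESHOLD}")
```

Even after throwing away half the pattern and adding noise, the overlap stays far above what an unrelated pattern would produce, which is the property I mean by "enough active bits in the right places".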
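
The stateless-vs-stateful sketch below is a toy illustration of the time-invariant vs. time-variant point in the temporality bullet. The "stateful" model here is an ordinary Elman-style recurrent cell, not an HTM temporal memory, and all weights are random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# A stateless feed-forward layer: the same input gives the same output every time.
W = rng.normal(size=(4, 4))
def feedforward(x):
    return np.tanh(W @ x)

# A minimal stateful cell (an Elman-style RNN, standing in for any stateful model):
# its output depends on everything it has seen so far, i.e. on its context.
W_in, W_rec = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
state = np.zeros(4)
def recurrent(x):
    global state
    state = np.tanh(W_in @ x + W_rec @ state)
    return state

x = np.ones(4)
print(np.allclose(feedforward(x), feedforward(x)))  # True: time-invariant
print(np.allclose(recurrent(x), recurrent(x)))      # False: the second call starts from a new state
```

The feed-forward function is pure; the recurrent one is not, and that impurity is exactly what lets it respond differently to the same input depending on what came before.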
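
And the online-learning sketch contrasts one-time batch training with the continuous online updating I mean in the last-but-one bullet. It uses plain stochastic gradient descent on a made-up linear model, which is not how HTM learns; it only illustrates the "update after every sample, never revisit old data" style and why it keeps a model agile when the world drifts.

```python
import numpy as np

rng = np.random.default_rng(7)

# A stream of (x, y) samples from a slowly drifting linear relation y = a*x + noise.
def stream(n):
    a = 1.0
    for _ in range(n):
        a += 0.001                       # the world slowly changes
        x = rng.normal()
        yield x, a * x + 0.1 * rng.normal()

# Online learning: update the model from every sample as it arrives, then discard the sample.
w, lr = 0.0, 0.05
for x, y in stream(5000):
    error = w * x - y
    w -= lr * error * x                  # one small gradient step per sample

# A model fitted once, in batch, on the first 1000 samples would still use a slope near 1.5
# and would never notice the drift unless someone retrains it.
print(f"online estimate of the slope after the drift: {w:.2f}")   # close to the final value of ~6.0
```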

Possible Pros

  • HTM needs less data than a neural network in some test settings. Here I am referring to the master's thesis of Vincenzo Lomonaco. Lomonaco's thesis is not enough to convince me that HTM is inherently as good as or better than a neural network at finding fixed features. My gut feeling says HTM is worse at finding fixed features in general.

  • Neural networks have blind spots, as described in the paper "Intriguing Properties of Neural Networks". Say you train a neural network on images of cats and dogs: you do not need to modify a dog image much to make your well-trained network predict it as a cat, and to your eye the original and modified dog images are virtually identical. I don't know whether HTM has such blind spots. Apparently humans do not, so this kind of blind spot is taken care of somewhere between a neural network and the brain, and HTM sits somewhere in between. My gut feeling is that HTM's noise tolerance plays a role here. (The toy sketch after this list shows the mechanism.)

  • HTM is possibly better at focusing attention. It is good at anomaly detection and at finding the same object over and over again in noisy data. I would build an attention mechanism on top of threat, reward, and anomaly. I am not convinced that neural networks are good at focusing attention. Attention should help you allocate resources better, thus decreasing the need for resources, yet some attention mechanisms I found increased the need for resources; this was the case with sequence-to-sequence variations. There are some interesting and potentially feasible attention mechanisms for neural networks, for example the Recurrent Attention Model by Google DeepMind. But HTM is simple, as mentioned above.
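
To illustrate the blind-spot bullet, here is a toy version of the effect described in "Intriguing Properties of Neural Networks". This is a bare linear classifier with made-up random weights, not a deep network, and the "images" are random vectors; it only shows the mechanism: a perturbation that is tiny per pixel but aligned with the model's weights flips the prediction.

```python
import numpy as np

rng = np.random.default_rng(1)

D = 784                                    # a 28x28 "image", flattened
w = rng.normal(size=D)                     # made-up weights of a "trained" linear dog/cat classifier

def predict(img):
    return "dog" if w @ img > 0 else "cat"

x = rng.uniform(0.0, 1.0, size=D)          # a random stand-in "image" with pixels in [0, 1]
score = w @ x

# Smallest uniform per-pixel step in the gradient-sign direction that flips the class
# (the fast-gradient-sign idea, applied to this linear toy model).
eps = (abs(score) + 1e-3) / np.abs(w).sum()
x_adv = x - np.sign(score) * eps * np.sign(w)

print(predict(x), "->", predict(x_adv))    # the predicted label flips
print(f"max per-pixel change: {eps:.4f}")  # only a few percent of the [0, 1] pixel range
```

A real deep network is not linear, of course, but in high dimensions many tiny aligned changes can add up in the same way, which is roughly why such barely visible perturbations can work.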

Cons

  • It looks like HTM shines on highly distributed processing architectures, like networks of neurons. Optimizing HTM for GPUs might be hard.

  • HTM is not as mathematically well-founded a solution as neural networks; for example, there is no equivalent of backpropagation for HTM.

  • The theory of HTM is still under construction, and it may end up being a failed theory.

  • There are only very few applications, developers, practical experiences, and optimizations for HTM. HTM has not been reality-checked; neural networks have been tested against reality.

  • The HTM technique has not been used to win any major machine learning competition.

Possible Cons

  • If there is good-quality data with clear and static patterns in it, then I would go for deep learning. For example, deep learning may be better at recognizing a handwritten address on a postcard. In theory, HTM is happy to lose most of the data, get lots of noise on top of that, and still predict well. My gut feeling says that you must lose something in that process.

Thanks for this comparison. I think you did a good job with this. Here are a few of my thoughts.

I don’t believe this is true. Our current implementations are very crude and simple compared to what is actually happening in a layer of cortex, and we still don’t understand a lot of it.

Do you have a link to this work?

I don’t think I would assert that. We have not focused at all on how attention in the brain works, or how it will work with HTM. This is an unknown area, IMO.

Thanks for your opinions!

I did not read that. I read a bit here and there, plus the conclusions. Not the best way to cite a paper, but it felt like enough for me.

Too bad, but thanks for letting me know. People tend to be bad with complexity when they cannot just act on intuition, and building software is one of those situations. I hope HTM becomes a very modular solution with a simple interface. It will be hard for people to scale it if the architecture looks like a picture of a neuron mesh, unless it becomes so unsupervised that it learns the architecture and how to create all the connections by itself. That would be something!

I am sure that you know those issues better than I do. It was on my mind, so I said it :slight_smile:


One thing to keep in mind is that there is a distinction between the user’s interface and the internal architecture of the technology. Regardless of how complex the internal pieces get, I’m sure they (Numenta) will strive to keep the interface very accessible. The ability to scale the software module(s) is part of the design process, and I’m confident that this concern will be addressed, as any good software is written with this in mind. The ability to combine discrete parts into larger compositional components, such as building out a multi-region HTM, seems to me to be one of the requirements one would have when designing the overall HTM code. So I wouldn’t worry about this too much either…? :wink:


Thank you for this dissertation! And for an excellent conversation!

rich


I’ve read a bit of Vincenzo’s PhD thesis and have exchanged a couple of messages with him on LinkedIn. However, as I told him, I’m not sure his interpretation of HTM theory matches Numenta’s, or that of the folks in this forum. First, his work is mainly focused on static image processing; that is, there is no notion of sequential data, at least not in what I’ve read. Second, there is no notion of sparse distributed representations, the essential data structure of HTM theory. If someone has found something relevant in his work, please let me know.


How and why would he consider it HTM without a temporal (or at least a time-like) axis?

Without SDRs? That sounds like HTM without any of the theory and without anything HTM actually has under the hood.

Yeah, I’m not entirely sure where HTM fits in there. I think his work is more related to the first generation of HTM algorithms; it is mainly focused on computer vision and how it benefits from a hierarchical structure of processing units.


My neocortex did not predict that :slight_smile: