Why isn't HTM mainstream yet


Harsh. But also incorrect. There are various tricks for not getting stuck in poor local minima, and at least some of them we understand quite well. For example, applying momentum to your gradient updates is a pretty straightforward way to leap over local minima and it’s obvious why it works.
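For reference, the momentum trick mentioned above fits in a few lines. This is a minimal illustrative sketch assuming a 1-D objective with a user-supplied gradient; `sgd_momentum` is a made-up name for this post, not any particular library's API:

```python
# Hedged sketch of gradient descent with momentum. The velocity term
# accumulates a decaying sum of past gradients, which lets the update
# roll through small bumps and shallow dips in the loss surface.

def sgd_momentum(grad, x0, lr=0.01, beta=0.9, steps=1000):
    """Minimize a 1-D function given its gradient, using momentum."""
    x, v = x0, 0.0
    for _ in range(steps):
        v = beta * v - lr * grad(x)  # accumulate decaying velocity
        x = x + v                    # step along velocity, not the raw gradient
    return x

# Example: minimizing f(x) = (x - 3)^2, whose gradient is 2(x - 3);
# the iterate oscillates and then settles near x = 3.
```

With `beta = 0` this reduces to plain gradient descent; raising `beta` toward 1 gives the updates more "inertia" to carry them over local minima.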

This is not to say we understand everything, but it’s clear there’s a lot we do understand (and some of it is pretty obvious).


And I think it’s fair to say that the point of this thread isn’t to start a religious war over the details of what has or has not been proven (a lot has, a lot has not).

As has been said a few times, one very important reason people aren’t using HTM is that it doesn’t work better than other techniques. That’s all the reason people need to not use it.


@jakebruce, sure there are tricks, but gradient descent works surprisingly well even without any tricks! We do understand some things, obviously, but a lot of fundamental concepts are still a mystery. As (another) example, we have no idea why neural networks generalize so well (https://arxiv.org/abs/1611.03530).


How do we even know this is true? My god, the generalizations… It’s got 4,871 stars and 1,391 forks - I think it’s safe to say, err… SOMEBODY’s using it… You know… Tim - yeah THAT guy! :stuck_out_tongue:


@michaelklachko I didn’t spot it?

In all 3 of those quotes, my implication was that there are some things the human brain does wonderfully (e.g. creativity, generalization, abstraction, pattern recognition [at certain granularities ONLY], generating distinctions in reality, etc.) - and then there are those that modern computers do better (e.g. crunching long lists of numbers as a calculator does, anomaly detection and pattern recognition [at certain granularities ONLY], calculation speed, inexhaustibility, etc.).

…and that the traits in those sets of specialization have areas where they overlap and areas where they don’t. Also, the number of neurons in the neocortex comes to around 19 billion, so right now, even if we had a complete model of the neocortex, we would still be operating at a far smaller functional scale - and who can say, at our current level of research, how large a system needs to be to excel at the same tasks that current ML systems (some utilizing over 14 thousand GPUs) perform?

…but you are absolutely right. The comparison between the two is meaningless (for the most part). And I would add that, whether we/they know it or not, and whether we/they like it or not - 10 years from now everybody will be doing some form of HTM. How we get there, in a certain sense, is as immaterial as the ancient argument over whether to use a binary system or a decimal system to implement Von Neumann architectures is today. :wink: [The drive-by evangelist strikes again! Hi Ho Silver… Away!] :slight_smile:


Everyone here has good points. The fact is that HTM is not mainstream, and honestly it doesn’t matter why, and it is not going to change the trajectory of HTM development. I appreciate everyone’s viewpoints, and there is no need to quibble over the details.

We are all here on this forum because we have an interest in HTM and want to see it progress. In that way, we can all agree (hopefully) that this continuing work is important and worthwhile to pursue.


Sorry Matt, but this is exactly why it does matter.
Some people think HTM isn’t popular because it doesn’t work well, which would mean it possibly has some major flaws. If they’re right, then maybe the trajectory of development should change.
I love J. Hawkins and his work. All these years I’ve seen you trying to push the community to use HTM techniques in different applications. I think it would also be helpful to push people to offer some meaningful criticism.



I truly believe that one day (in the next 3 years or so?), HTM performance will so far outstrip any ML comparison that this conversation won’t be relevant. You have to remember that HTM Theory is not like mainstream NN practice, in that it is plain as day where the innovation is going to come from. As more and more of the neocortex’s functionality gets added to the theory, it will become more and more powerful.

Conversely, NN development is a shot in the dark, with stuttered stops and starts that last decades, because there is no guiding model and no guarantee that it will ever amount to anything. Traditional ML research isn’t operating with a “guide” of sorts, while HTM Theory is! Again and again and again, this is the argument put forth by the biological faction: “use the guide that Mother Nature has provided first, then improve upon it later TO PREVENT WASTING TIME!”

Secondly, the argument over whether it works, or whether it works well, is incidental and only temporarily relevant. What does exist works… it works for what it does. Who’s to say, when comparing apples and oranges, that a certain amount of apple development should equal or surpass any particular amount of orange development?

The comparison doesn’t make sense, and the only thing that does make sense is that eventually, when enough of the theory is complete, all of this will become obvious to those who now feel it is somehow substandard.

…just an opinion about one way or context to hold this stuff in, ok?


This is not to diminish the effort it has taken so far, or the tremendous leap of insight it took to discover the basics of, and compose a working theory of, the function of one of the layers of the neocortex. Further, I’m sure every push forward comes by virtue of a tremendous amount of effort. But what there is TO study and understand is right in front of us, thankfully.

(Please note: When I say “us” as if I’m actually doing something (which I’m not), I speak from the vantage point of humanity as a whole. I know there are specific individuals [i.e. research staff of Numenta and others both directly associated - and not.] who deserve a great deal of unrecognized credit for their undaunted efforts! So this is an acknowledgement of that!)


The best way to prove HTM to the mainstream is:

  1. benchmarks - which Numenta have recently created
  2. contests entries e.g. Kaggle that target time series data prediction or anomaly detection
  3. commercial products like Grok

All three of these methods are forced to prove out the value in a competitive environment, which is what our economic engine respects. No talk of science or theory needed.

I was on the Kaggle team that attempted this a couple of years ago, and I posed the same question at the time: what is the best way to prove HTM? We failed for several reasons, and you can read into it what you want, but fundamentally our team was composed of people with varying degrees of knowledge of HTM and its operation, and for the most part we did not know how to get results with HTM (2 years ago). Further, we did not gel as a team, and none of us had entered a Kaggle competition before (an art in itself). The deck was stacked against us. Just understanding and preparing raw EEG data for HTM consumption was a challenge. I would like to see a better prepared team - skilled, already functioning as a team, and experienced with Kaggle - attempt this again today. It would tell us a lot about the real-world possibilities for HTM.

I do believe in Numenta’s approach based on Hawkins’ insights and work, but if it does not scale up cheaply with modern hardware, it will be seen as a toy. There is no easy way to accelerate the algorithm with GPUs or map-reduce, so it is mostly run on laptops with toy examples. I’m waiting for the day when acceleration comes in silicon, either via custom chips or off-the-shelf neuromorphic building blocks such as memristor arrays. In the meantime, all the glory will go to algorithms that can scale.


Imagine a time where there’s an algorithm that totally implements intelligent consciousness, but operates incredibly slowly given the day’s hardware. Would that even be a significant consideration or would the world be tirelessly working on a way to accelerate said technology? I think the value of Numenta’s approach transcends its ability to be used immediately…


I have a pretty good understanding of HTM theory, though I haven’t gotten around to working with NuPIC yet.

That being said, from what I’ve gathered, my guess is that HTM would probably get more attention if it were more useful. More specifically, if one could get more information out of it.

It seems as though most of the info that NuPIC can give you now is just anomaly data, by counting how often columns burst. That’s like having a massive deep neural network and only getting information on how many neurons are producing outputs above a certain threshold.
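For concreteness, the burst-based anomaly measure described above reduces to something like the following. This is a simplified sketch of the idea (the fraction of currently active columns that were not predicted at the previous timestep), not NuPIC's exact implementation:

```python
def anomaly_score(active_columns, predicted_columns):
    """Raw anomaly score: the fraction of active columns that were
    unexpected. 0.0 means every active column was predicted (a bursting
    column is one whose activity was not predicted); 1.0 means the
    input was entirely surprising."""
    active = set(active_columns)
    if not active:
        return 0.0
    unexpected = active - set(predicted_columns)
    return len(unexpected) / len(active)
```

So out of all the modelling the temporal memory does, the consumer of this number sees only a single scalar per timestep, which is the point being made here.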

HTM is doing all this prediction, modelling, pattern recognition, etc., and all you’re able to get out of it is how anomalous the input is. The reason why everyone’s using deep learning these days is that it gets them that kind of information. It doesn’t matter if HTM’s models are a million times better; you can’t get very useful information out of them, and so no one will use it.

I think adding something akin to motor output to HTM would help this a lot. Some kind of reinforcement learning. It doesn’t matter if it’s based on real neuroscience or not, at least not at first. Something would be a whole lot better than nothing.


Dangerous idea. No need to rush, let neuroscience research lead the way.

HTM is a work in progress - Numenta are still just working on modelling the feedforward pathways of a single region. HTM will be far more pragmatic when multiple regions with feedback are implemented.


Hi @Charles_Rosenbauer,

You are also able to get multiple simultaneous predictions, with a probability score of what is most likely to occur based on spatial “belongingness” and temporal expectation/prediction.


I understand. Perhaps I worded things wrong.

What I meant was that it would be useful to get more detailed information on the pattern recognition, and it doesn’t have to involve simulating a whole motor circuit. That’s something that NuPIC already does. The brain doesn’t calculate and output anomaly scores.

@cogmission, thanks for reminding me. I forgot about NuPIC’s classifier. From what I’ve seen about how it works, it’s definitely not a biologically accurate method either.

What might work is having something like a small set of HTM-like neurons (simplified at least, maybe with just proximal connections), that get their inputs from another HTM region and can be trained via supervised learning. For example, if I had some HTM region learning data from video, and I wanted to know when a person was on the screen, I could have a neuron trained to recognize whatever representation it used for a person.
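A minimal sketch of that idea, assuming the upstream region exposes its output as a set of active-bit indices. `SDRReadout` is a hypothetical name invented for this post, and the unit is a plain perceptron rather than a biologically accurate HTM neuron:

```python
class SDRReadout:
    """Toy supervised readout unit over a binary SDR coming from an
    upstream HTM-like region. A perceptron-style neuron with only
    'proximal' weights; hypothetical, not part of NuPIC."""

    def __init__(self, input_size, lr=0.1):
        self.w = [0.0] * input_size  # one weight per input bit
        self.b = 0.0
        self.lr = lr

    def predict(self, sdr):
        # sdr is the list of indices of active bits in the region's output
        s = self.b + sum(self.w[i] for i in sdr)
        return 1 if s > 0 else 0

    def train(self, sdr, label):
        # classic perceptron rule: update only when the prediction is wrong
        err = label - self.predict(sdr)
        if err:
            for i in sdr:
                self.w[i] += self.lr * err
            self.b += self.lr * err
```

Trained on labelled SDRs (e.g. "person on screen" vs. not), such a unit reads out whatever stable representation the region has formed, without modifying the region itself.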

Sure, you can use other machine learning techniques to get that kind of information, and perhaps some that might be a bit better for that, but there’s of course other information that you get as well from NuPIC. Being able to get both without running multiple systems at once might make the algorithm a bit more attractive.


In my opinion, we’re talking about two different things :

1.- HTM Theory itself
2.- HTM implementations, as Nupic or htm.java.

So, I’ve only got a lot of questions:

Is HTM Theory harder to understand than the math behind ML? I don’t think so; HTM is more intuitive, and you can explain it without any math. However, I can easily find a lot of “low level” documentation about ML, while the HTM documentation is mainly exhaustive and detailed - maybe a little hard for many people. When will we get the book “HTM Theory for Dummies”?

Is NuPIC a good implementation, not only from the point of view of HTM theory adoption but also in other technical characteristics: user-friendliness, good documentation, modularity, easy installation, educational examples, and the like? Does it have clear metrics to check and compare the accuracy of the network?

In my own experience a “welcome pack for beginners” would have reduced the effort to start. Not everyone can read the code directly. Maybe a visual build tool to put together sensors, models, classifiers?

ML has a lot of wrappers and interfaces, such as Keras: you hardly need to know anything about TensorFlow to use it.

Is anomaly detection on scalar series the best example to show the potential of HTM Theory? Isn’t it, rather, a paradigmatic mathematical problem? Do you want to be compared with decades of math research at your first step?

As usual, excuse my basic English.


This will always be necessary because neurons don’t speak English. (yet) - The system will always have to communicate its knowledge to the outside world or have that knowledge be extractable from the outside world.


Surely having a server farm with 14K GPUs can’t be any more “convenient” than having a cortical model wrapped in an interpretive layer?


I meant for smaller scale stuff. The big companies might start considering NuPIC if the smaller ones and hobbyists start using it too. Most smaller companies and hobbyists can’t afford server farms that big.

The hobbyists, especially people trying to hack something together are probably going to want simpler solutions. Most people want pattern recognition out of their ML algorithms, so that’s what they’ll go for. Maybe a few more people will try NuPIC if it offers that, plus anomaly detection.

I recall someone running benchmarks a year or two ago suggesting that NuPIC is likely better at temporal modelling than LSTMs. Why do people still use LSTMs then? Because they can output more useful information. I saw a video the other day of someone using one for AI-generated YouTube comments. Can NuPIC do that? Maybe someday, but certainly not in its current state.


Agreed - but I’ll bet a month’s paycheck that those comments are contextually nonsensical at best! But regardless, what we need is people like you in the community to make sure that we apply all of our rigor to the improvement of HTM Theory and its implementations - it will only make things that much better!