I've implemented HTM to run on the GPU with PyTorch and used it to generate text

Today, HTM has a few problems that limit the degrees of freedom when it comes to building real-world applications.
So I have developed bitHTM to address these issues.

The first problem is the time it takes to run the algorithm. This aspect has been, and is currently being, addressed by many people in this community. One instance is Etaler by @marty1885. He used OpenCL to implement an HTM that's optimized for speed and well suited for building applications. I, on the other hand, implemented bitHTM with NumPy initially and then ported it over to PyTorch, a machine learning framework that's popular in the deep learning community. (PyTorch can run on the GPU and has an API very similar to NumPy's.) I also tried to stick to the original HTM algorithm.

The second one is that it's hardly compatible with deep learning systems. The HTM algorithm consists of binary operations, which are not differentiable and thus prevent backpropagation from working. This has been somewhat addressed by DeepHTM, which I developed. I had this in mind when implementing bitHTM, so I could (relatively) easily modify it into DeepHTM. This would allow all kinds of applications, such as using HTM on videos. You can hear about it in a bit more detail here.
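To make the incompatibility concrete, here's a minimal PyTorch sketch (this is an illustration, not DeepHTM's actual method) showing how hard binarization cuts the autograd graph, along with one common workaround, the straight-through estimator:

```python
import torch

x = torch.randn(8, requires_grad=True)
active = (x > 0.5).float()   # hard binarization, like HTM's active/inactive states
print(active.requires_grad)  # False: the comparison cut the autograd graph

# One common workaround (not necessarily what DeepHTM does) is the
# straight-through estimator: the forward pass uses the binary values,
# while the backward pass pretends the op was the identity.
active_ste = x + (active - x).detach()
active_ste.sum().backward()
print(x.grad)  # all ones: gradients flow as if no binarization happened
```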

The implementation of bitHTM is highly parallelized so it can run efficiently on a GPU as well as on multiple CPU cores. To test its performance, I ran an experiment to have it generate text.
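To illustrate what "highly parallelized" means here, a small sketch of a spatial-pooler-style overlap step written as whole-tensor operations, so PyTorch can dispatch it to the GPU or multiple cores. The sizes and sparsity values are hypothetical, not bitHTM's actual internals:

```python
import torch

# Hypothetical sizes, not bitHTM's actual configuration
n_inputs, n_columns = 1024, 2048
device = "cuda" if torch.cuda.is_available() else "cpu"

# Binary connected-synapse matrix and a binary input SDR
connected = torch.rand(n_columns, n_inputs, device=device) < 0.02
input_sdr = torch.rand(n_inputs, device=device) < 0.05

# Overlap of every column with the input, in one broadcasted AND + sum
overlaps = (connected & input_sdr).sum(dim=1)

# k-winners-take-all: activate the 2% of columns with the highest overlap
k = n_columns // 50
active_columns = overlaps.topk(k).indices
print(active_columns.shape)  # torch.Size([40])
```

Because there are no Python-level loops over columns or synapses, the same code runs on a CPU or a GPU without changes.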

The dataset I used is from blog posts.
First, I lowercased all the text, extracted the 5,000 most frequent "words" from it, and kept only the posts that consist entirely of those words. The "words" here are either tokens separated by spaces within the text or punctuation marks. I did this so that each word could be encoded separately and fed into HTM.
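As a rough sketch of that preprocessing in Python (the exact tokenization rules I used may differ, and the corpus here is a toy stand-in for the blog posts):

```python
import re
from collections import Counter

# Toy corpus standing in for the blog posts
posts = [
    "Hello, world! This is a post.",
    "Another post, with a few more words.",
]

def tokenize(text):
    # Lowercase, then split into space-separated words and punctuation marks
    return re.findall(r"\w+|[^\w\s]", text.lower())

tokenized = [tokenize(p) for p in posts]
counts = Counter(w for post in tokenized for w in post)
vocab = {w for w, _ in counts.most_common(5000)}

# Keep only the posts that consist entirely of vocabulary words
kept = [post for post in tokenized if all(w in vocab for w in post)]
```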

The encoder I used is not hard-coded but rather trained by backpropagation. It's a CBOW word2vec network I trained with PyTorch, with a loss function that enforces sparsity. You can take a look at the code here.
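For reference, a hedged sketch of what such an encoder could look like. The architecture, sizes, and the exact sparsity loss here are my assumptions, not the real network's:

```python
import torch
import torch.nn as nn

# Hypothetical sizes: vocabulary from the preprocessing step, SDR width
vocab_size, sdr_size = 5000, 256

embed = nn.Embedding(vocab_size, sdr_size)
out = nn.Linear(sdr_size, vocab_size)

def cbow_step(context_ids, target_ids, sparsity=0.02):
    # CBOW: predict the target word from the average of its context embeddings;
    # sigmoid squashes the code to (0, 1) so it can be thresholded into an SDR
    code = torch.sigmoid(embed(context_ids).mean(dim=1))
    pred_loss = nn.functional.cross_entropy(out(code), target_ids)
    # Simple sparsity term: penalize deviation from the target active fraction
    sparsity_loss = (code.mean() - sparsity).abs()
    return pred_loss + sparsity_loss

context = torch.randint(0, vocab_size, (4, 2))  # (batch, context window)
target = torch.randint(0, vocab_size, (4,))
loss = cbow_step(context, target)
loss.backward()
```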

Then I fed the encoded word representations to bitHTM like you'd do with any other HTM system. It was trained on a GPU on Google Colab. I fed it a total sequence of 1,185,255 words it had never seen before, which took over 5 hours (about 16 ms per step on average).

To decode its predictions into words, I made a decoder, which is again trained by backpropagation. It's a multilayer perceptron with 3 layers that use leaky ReLU as the activation function. It receives the active cells of the HTM as input and outputs a probability for each word given those active cells.
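A sketch of that decoder; only the 3-layer / leaky-ReLU structure comes from the setup above, while the hidden sizes and cell count are made up for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical sizes; only the 3-layer leaky-ReLU shape is from the post
n_cells, vocab_size = 4096, 5000

decoder = nn.Sequential(
    nn.Linear(n_cells, 512),
    nn.LeakyReLU(),
    nn.Linear(512, 512),
    nn.LeakyReLU(),
    nn.Linear(512, vocab_size),  # logits over the 5000-word vocabulary
)

# Binary vector of the HTM's currently active cells
active_cells = (torch.rand(1, n_cells) < 0.02).float()
probs = torch.softmax(decoder(active_cells), dim=-1)
predicted_word_id = probs.argmax(dim=-1)
```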

Here are some results I’ve got from this:

  • i ’ m not exactly to me , and i ’ m not sick where u get a challenge , the most important thing i will be strong and i miss describe the day i ’ m sure it to my mom and i ’ m so excited !
  • i gained 2 votes * cough * i ’ m going to the beach paragraph
  • i ’ m so bored , that ’ s about it .
  • i ’ m not going to do you thankful for the day " .
  • urllink this is the start of the cousin and i ’ m gonna go see it .
  • urllink this is me , and i can ’ t have to call you to know that i should actually a kid i ’ ll leave the one who ’ s the new layout for we just have to get that really does anyone ?
  • i ’ ve been quite , it ’ s the site .
  • i don ’ t have a good way too much better than your knees of the week on a good week - it ’ s a urllink good blogging . wish me luck this post is a link to my blog !
  • i have to get out of my mind looking for my dad , i ’ ve got a raise . i ’ m not doing nothing to do this is the start of the house are you ? brought to you by urllink quizilla there ! i ’ m not that way .
  • urllink hello ! not the end of the rings for my mom and urllink anybody part two are some people who can be found urllink here .
  • urllink which evil to get a lot of thing should be with you i missed a little while . the group and the fact that i post a good day !
  • i ’ ve never been so happy again . i get out of here ! and i am so worried this new one to the library .
  • i think i ’ m going back to the beach .
  • well , i ’ m not in a grass , i ’ m not going to get me up .
  • happy birthday to all my friends get a new template .
  • i was gonna die , u by telling for no matter how much i ’ ve got ready
  • i ’ m addicted . 0 0 4 5 0 0 3 & # 3 7 2 days left until that ’ s all the counts .
  • i lost two my place are you ? brought to you by urllink quizilla
  • i ’ m not the cinema complex i had to admit , , . hehe . thank god bless !
  • and i ’ m just gonna go boston ! ! ! !
  • i gained 2 6 1 _ ^ ^ ^ _ ^ 2 , 1 0 0 2 7 _ _ _ _ _ _ _ _ _ 1 th . f emergency to the right thing i ’ m not going to a party .
  • my results is doing the current is there around to post for the girls as a lot better to figure out urllink 7 0 , not until cheese cheese cheese .
  • i ’ m a fruit , we ’ re on fire .
  • i ’ m not the cinema to see her on a nice break are you ?
  • dang may be fun . only be happy . lol .
  • urllink create your own . i ’ m going to be really afraid of my system up a few drinks than i have to say the last two people need to all , , ( ) . you . but i dont have a lot .

… Note that it’s just an experiment to measure the performance of bitHTM. :scream:
I guess at least they have some structure to them?
urllink is a token from the dataset that represents a URL link.
This is generated only from one input, the “post begin” token.

I’m sure you could get better results with a careful tuning of the setup.
The code of this experiment is here.

I am happy to answer any questions you might have. :smiley:


@hsgo this is amazing!

Absolutely. These are grammatically correct sentences. I know people who do worse. :-D.

These have got to be my favorites. :-).

Absolutely fantastic. Great job, man!


This is amazing! I can already think of so many things that could be possible using HTM. I'm checking out your code right now; perhaps it may be possible to apply it to reinforcement learning.


@ayaz-amin I’d highly recommend Etaler by @marty1885 tho. It’s more complete and production-ready. :slight_smile:


Very interesting work.

Then I fed encoded word representations to bitHTM like you’d do with any other HTM systems

This sounds like there's a standard way to do word representations. I've just played around with word representations and SDRs. Can you give an example of how you did the word encoding/word representation?

As far as I know, there is no “standard” way of encoding text for HTM.

As described in the original post, I trained a deep learning network to encode words, with a simple loss to constrain the sparsity.