Homegrown Python 3 HTM implementation

Hi folks! Three months ago I set out to learn more about HTM and Python. I figured the best way to do so was to write my own HTM library. I’ve decided to share it since it may help other newcomers like me learn HTM. It’s very far from finished, but I’m proud to share the first working version! It’s pretty sparse in functionality (pun intended), but the SP and the TM both run. I have a basic example script with a contrived sine wave signal on the repo, and I plan to upload a HotGym example as well.

Edit: I forgot the link. https://github.com/Carlton941/PyHTM/blob/master/README.md

Edit again: if anyone actually takes a look and is horrified by my code and/or finds a bug, let me know! I’m a new pythoner and always looking to improve.


I got a 404 error when I clicked on it.

Yes, you’re right to test on the Hot Gym data of course, since it is well studied and has higher-order patterns, which a basic sine wave won’t.

Haha, my first bug! I’ll fix the link when I get home.

I had it set to private by mistake. Whoops!

Nice!! Are you sure this is Python 3? I see you are still using most of the NuPIC libraries, which are implemented in Python 2 [as far as I remember].

Suggestion: There is no need to encode the days and weekends separately. You can have both in one statement.

Huh? I didn’t import NuPIC at all. Did you actually go to the repo posted by Psdyer? I cited it in the readme file for my repo since I used it as a guide, but the code in the PyHTM file doesn’t use NuPIC.

Good point! I never even considered that. That will save some space.

Edit: here’s a direct link to the code to clear things up! https://github.com/Carlton941/PyHTM/blob/master/PyHTM.py

Good job! Your code is very clean and readable :slight_smile:

… I’m in the middle of doing almost the same, so it’s a nice +1 place to compare with or get inspiration from


Thank you!

And good luck with your project as well =)

I’ve made an example processing the HotGym data, and the results are mixed.

  1. I trained an sklearn regressor on the active cell SDRs and use it to translate the predictive cell SDRs into predicted energy usage. When plotted alongside the actual usage from the .csv file, it matches fairly well.

  2. I added a custom anomaly score calculator and plot the scores alongside the energy data. The anomaly scores are a bit all over the place. I’m not sure if this is because of a) the parameters I’ve chosen (sparsity, encoding resolution, etc.), b) the way I’m calculating the anomaly scores, or c) some other issue I haven’t thought of yet.
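
For comparison on point 2, the raw anomaly score usually used with HTM is the fraction of currently active columns that were not predicted on the previous step. Here is a minimal sketch of that calculation, assuming columns are given as collections of integer indices. The function name is hypothetical and is not taken from PyHTM.

```python
def raw_anomaly_score(active_columns, predicted_columns):
    """Fraction of active columns that were NOT predicted on the previous step.

    Both arguments are iterables of column indices (an assumed representation).
    Returns 0.0 when nothing is active, so an empty input is not flagged.
    """
    active = set(active_columns)
    if not active:
        return 0.0
    predicted = set(predicted_columns)
    return 1.0 - len(active & predicted) / len(active)
```

A score of 0.0 means every active column was predicted, and 1.0 means none were. If a custom calculator gives noticeably different values on the same inputs, that would be one place to look.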

I’d thoroughly investigate this before fully trusting the implementation, since it could mean a bug somewhere.

I think the simplest way to test a TM implementation is to use a noiseless sequence that reuses certain inputs, like the basic one:


It should be that the anomaly scores settle to 0 fairly quickly and the number of predictions should settle to 1 a bit more slowly. There are also unit tests for TM which you could use, though I’m not sure what they test for exactly.

Here’s a figure showing one of my first tests, using the simple sine-wave data with no noise. There are a couple of deliberate discontinuities and a region where the frequency increases. The green line in the second window shows the translated predictions, with anomaly scores in the bottom window. I believe this shows the expected behavior–is that correct?

Looks like it to me. I’d be curious to see the same on a more complex sequence, even 2 different sine waves stuck together, where it would need a couple of iterations of the total sequence to learn. I’d also recommend plotting a measure of the amount of predicted activity, like: number of predictive TM cells / number of active SP columns. I think with the simple sine there’s no ambiguity, so there should be no more than 1 prediction at any time. With a more complex and ambiguous sequence the system should make multiple predictions at certain points.
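
The suggested measure could be computed along these lines. This is only a sketch: the function name and the idea of passing predictive cells as hashable IDs (for example `(column, cell)` tuples) are assumptions, not PyHTM's actual interface.

```python
def prediction_ratio(predictive_cells, active_columns):
    """Number of predictive TM cells divided by number of active SP columns.

    predictive_cells: iterable of hashable cell IDs, e.g. (column, cell) tuples.
    active_columns:   iterable of active column indices.
    Returns 0.0 when no columns are active, to avoid dividing by zero.
    """
    num_active = len(set(active_columns))
    if num_active == 0:
        return 0.0
    return len(set(predictive_cells)) / num_active
```

If each predicted pattern puts roughly one cell per column into the predictive state, a value near 1 suggests a single prediction and a value near 2 suggests two simultaneous predictions.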

You mean, for instance a few sine waves with different frequencies multiplied or added together? I’ll try that out and plot the prediction count! Could be enlightening.
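
A signal like that could be generated as follows. This is a small sketch using only the standard library; the function name, the frequencies, and the step size are placeholder choices for illustration.

```python
import math


def composite_sine(num_steps, freqs=(1.0, 3.0), step=0.05):
    """Sum of sine waves with the given frequencies, sampled every `step` time units."""
    return [
        sum(math.sin(2 * math.pi * f * t * step) for f in freqs)
        for t in range(num_steps)
    ]


signal = composite_sine(200)
```

The summed signal repeats only when all components line up again, so the system has to see a longer stretch before the pattern becomes predictable.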

Right, something like that, where the pattern is there but just much less trivial to learn. That should give a bigger picture of how the system is behaving.

That’s a good idea, I never actually tested the most basic example to be certain my TM was running properly. Happily, I did a simple test with A B C D, A B C D, A B D C, A B D C and plotted a) the prediction count and b) the anomaly scores. It shows exactly what I expected, so I think this confirms the TM is working correctly.

I also think having this example on the repo is really good for anyone else who, like me, wants to learn how HTM works and mess around with a simple example, so I uploaded a script for it.

Basic Sequence Example

These values look good. The only thing I’d add to this test would be “X,B,C,Y” attached to the “A,B,C,D”, to create some ambiguity.

Totally agree, thanks! It’d be great to have the same for NuPIC too.

That’s a good point, it would be more thorough to test a sequence that starts differently but has the same middle to see if it can tell the difference between B preceded by A and B preceded by X. I should expect only one prediction in that case–not two–right?

Whereas in the test I did, with two similar patterns A B C D and A B D C, it has to predict two values, since it doesn’t know what follows B yet.
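
To make the expected counts in these two cases concrete, here is a plain lookup table over the full preceding context. It is not HTM, just an idealized reference under the assumption that there is a reset between sequences; a TM that is still learning may show more predictions than this. The function name is hypothetical.

```python
from collections import defaultdict


def expected_prediction_counts(sequences):
    """For each observed prefix, count how many distinct symbols followed it."""
    followers = defaultdict(set)
    for seq in sequences:
        for i in range(1, len(seq)):
            # The key is the whole context seen so far, not just the last symbol.
            followers[tuple(seq[:i])].add(seq[i])
    return {prefix: len(nxt) for prefix, nxt in followers.items()}


same_start = expected_prediction_counts(["ABCD", "ABDC"])
same_middle = expected_prediction_counts(["ABCD", "XBCY"])
```

In `same_start` the context `("A", "B")` has two possible followers, while in `same_middle` every context has exactly one, because B preceded by A is a different context from B preceded by X.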

I tested this, and found some interesting behavior. I noticed that even with the simple sine wave there were several predictions being made at once. I was wondering why, especially after it performed well with the simple sequences ABCD and ABDC. I’ve discovered that it has something to do with the encoding density (EDIT: I should have said something more like the density of the values to be encoded). I took two sine wave examples, one with widely spaced values and one with more closely spaced values. Both used the same encoder and pooler. In the first case, there was no ambiguity about the predictions: only one prediction was made at any time (until an anomaly showed up). In the second case, there were a couple of predictions being made at any time (and this number increased after the anomaly).

Any thoughts on this?

I suppose it could just be that the closer the datapoints are, the more similar the encodings and the harder it is for the system to differentiate them. So the representations might fluctuate a little and make one sequence look like multiple, very similar but separate, sequences.
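
One way to check this idea is to measure how many bits the encodings of neighbouring values share. The sketch below uses a simple bucketed scalar encoder written for illustration; the function names and all parameter values are assumptions and do not reflect PyHTM's encoder.

```python
def encode_scalar(value, min_val=-1.0, max_val=1.0, n_bits=100, n_active=11):
    """Return the set of active bit indices for a contiguous-block scalar encoding."""
    value = min(max(value, min_val), max_val)  # clip to the encoder range
    n_buckets = n_bits - n_active + 1
    bucket = int((value - min_val) / (max_val - min_val) * (n_buckets - 1))
    return set(range(bucket, bucket + n_active))


def overlap(a, b):
    """Number of active bits shared by two encodings."""
    return len(a & b)


close = overlap(encode_scalar(0.0), encode_scalar(0.02))
far = overlap(encode_scalar(0.0), encode_scalar(0.5))
```

With these placeholder settings, values that are close together share most of their active bits, while values that are far apart share none, so densely sampled data gives the SP and TM much more similar inputs to tell apart.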

Oh ok, I didn’t catch the different bolded C & D.

I think you’ll get 2 predictions from B & C at first before it learns the full higher order context, then it should drop to 1.
