First off, hello HTM forum! This is my first post but I’ve been around for a while. I’ve waited until I had something to show you guys.

I’ve been interested in AI (or MI, as this community calls it) since I was young, when I read the Korean edition of ‘On Intelligence’. I was very impressed. Shocked, even.

I’ve studied and implemented both HTM and deep learning systems a couple of times, so I have a moderate understanding of both.

Then I thought: why not combine HTM and deep learning? I’m sure everyone in this forum has thought of it at least once.

Some might think it’s not possible, but deep learning is really just applying backpropagation to a function and minimizing the error using its derivative. Since everything, including HTM, can be expressed as a function, why not?
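To make that concrete, here is a toy sketch (the names and numbers are mine, not from DeepHTM) of the core idea: follow the derivative of the error to fit a single parameter. Backpropagation is this exact step, just chained through many layers.

```cpp
#include <cmath>

// Toy illustration: fit y = w * x to a target value by following
// the derivative of the squared error (y - target)^2 with respect to w.
double fit_linear(double x, double target, double lr, int steps) {
    double w = 0.0;                              // trainable parameter
    for (int i = 0; i < steps; ++i) {
        double y = w * x;                        // forward pass
        double d_loss_d_y = 2.0 * (y - target);  // derivative of the error
        double d_loss_d_w = d_loss_d_y * x;      // chain rule ("backprop")
        w -= lr * d_loss_d_w;                    // gradient descent step
    }
    return w;
}
```

Anything you can write as a chain of differentiable steps can be trained this way, which is the whole premise of DeepHTM.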

So I kept trying different approaches until one succeeded.

Of course, HTM and deep learning are fundamentally different, so it was not easy.

HTM uses binary values, while deep learning typically uses real (floating-point) values. It’s entirely possible to use binary values in deep learning too, but I thought real values were more interesting, so I stuck with them.

I’ve faced a lot of problems, such as how information should flow between the spatial pooler and the temporal memory, but I’ve tried to stick to the HTM philosophy as closely as possible.

I call this version of HTM ‘DeepHTM’.

I’ve implemented this using purely C++.

Using backpropagation with HTM comes with several advantages:

- You don’t have to hardcode the encoder and the decoder (the SDR classifier?).
  - You can easily feed complex data types like images to HTM, since you don’t have to hardcode the encoder, and you can even generate images from the output of HTM, which would be near impossible to do by hand.
- You can easily build a hierarchical structure with HTM. (In the plain form of HTM, with just the SP and TM, it’s not so effective, I think?)
- You can easily test a theory, since even without explicit learning rules, backpropagation will figure out what to do with the information it has. You just have to decide how the information flows.

… And so much more!

Top left: the network structure. Top right: first 1000 steps of training.

Bottom left: 4000 steps of training. Bottom right: after 10000 steps of training.

(The site doesn’t let me post with multiple images since I’m a new user so I had to do it this way.)

The network structure:

The input is a sine value ranging between -1 and 1.

The encoder consists of two fully connected ReLU (a deep learning activation function) layers with 256 units (cells) each.

The spatial pooler consists of 256 columns with 5 winner columns selected by global inhibition (learned through backpropagation, not simple boosting), and the temporal memory has 5 cells per column.

The decoder consists of two fully connected ReLU layers with 256 and 1 units, respectively.

The output is the predicted value of the next input.
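For anyone curious how global inhibition can look in code, here is a hedged sketch (my own reconstruction, not the DeepHTM source): a spatial pooler forward pass that keeps only the k strongest column activations and zeroes the rest. In a differentiable setup, gradients are then typically routed only through the surviving winner columns.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Global inhibition as k-winners-take-all: keep the k largest column
// activations, zero out everything else. Ties are broken by index order
// so at most k columns survive.
std::vector<double> k_winners(const std::vector<double>& columns, std::size_t k) {
    std::vector<double> sorted = columns;
    std::nth_element(sorted.begin(), sorted.begin() + (k - 1), sorted.end(),
                     std::greater<double>());
    const double threshold = sorted[k - 1];    // k-th largest activation
    std::vector<double> out(columns.size(), 0.0);
    std::size_t kept = 0;
    for (std::size_t i = 0; i < columns.size() && kept < k; ++i) {
        if (columns[i] >= threshold) {         // this column wins
            out[i] = columns[i];
            ++kept;
        }
    }
    return out;
}
```

In the network above this would run with 256 columns and k = 5, producing the sparse activity that the temporal memory consumes.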

This runs at about 1,500 steps per second and, last time I checked, doesn’t suffer from scaling problems.

Well, the results aren’t the best, but the model is quite small by deep learning standards (some might say it’s not even deep learning), and even by HTM standards.

And as for the training time, I’ve used the most basic version of backpropagation; today’s deep learning has nifty tricks that make training much faster and even more stable.
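One of the simplest of those tricks, sketched here as an assumption (it is not in my current implementation), is momentum: the update accumulates a velocity, which smooths out noisy gradients and usually converges faster and more stably than plain gradient descent.

```cpp
#include <cstddef>
#include <vector>

// SGD with momentum: velocity is a decaying running average of past
// gradient steps; the weights move along the velocity instead of the
// raw gradient, damping oscillations between steps.
void sgd_momentum_step(std::vector<double>& weights,
                       std::vector<double>& velocity,
                       const std::vector<double>& grads,
                       double lr, double beta) {
    for (std::size_t i = 0; i < weights.size(); ++i) {
        velocity[i] = beta * velocity[i] - lr * grads[i];  // accumulate
        weights[i] += velocity[i];                         // apply step
    }
}
```

Swapping a step like this in for plain backpropagation updates requires no change to the rest of the network.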

It definitely needs improvements but it’s just a proof of concept.

I’m planning to port this to run on the GPU using compute shaders and to combine in more deep learning components, such as convolutional neural networks.

I’m also planning to implement a differentiable grid cell module and the sensorimotor theory.

I’m happy to answer any questions you may have!

P.S. English is not my native language, so please excuse any mistakes here and there… :’(