I've begun writing an implementation of HTM using OpenCL to improve performance, since there are lots of places in the algorithms where we can parallelise. Any help would be appreciated (especially ideas about how the Temporal Memory can be parallelised), so issues and pull requests are welcome. Once it's done I'll write a paper about it as part of my PhD work.
Currently, I plan to implement:
- Spatial Pooler
- Temporal Memory
- CLA Classifier
- SDR Classifier
It'll probably take a while, since a lot of the work in writing OpenCL involves experimenting to find the best ways of dividing up the workload.
Ideally it would be a drop-in replacement for the existing algorithms in NuPIC (with hopefully better performance).
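
To give a rough idea of the kind of parallelism I mean, here's a minimal sketch of the Spatial Pooler overlap step in plain Python. It isn't the OpenCL code or NuPIC's API: the function names, the data layout (`synapse_inputs`, `permanences`) and the `CONNECTED_THRESHOLD` value are all assumptions made up for illustration. The point is only that each column's overlap depends on nothing but that column's own synapses, so the per-column function could plausibly map onto one work-item per column.

```python
# Hypothetical sketch: names, layout and threshold are assumptions, not NuPIC's API.

CONNECTED_THRESHOLD = 0.5  # assumed permanence at which a synapse counts as connected


def column_overlap(col, input_bits, synapse_inputs, permanences):
    """Overlap score for one column; this is what a single work-item would compute."""
    overlap = 0
    for input_index, permanence in zip(synapse_inputs[col], permanences[col]):
        # Count connected synapses whose input bit is active.
        if permanence >= CONNECTED_THRESHOLD and input_bits[input_index]:
            overlap += 1
    return overlap


def compute_overlaps(input_bits, synapse_inputs, permanences):
    # No iteration reads another column's result, so on a GPU this loop
    # could be replaced by a launch with one work-item per column.
    return [
        column_overlap(col, input_bits, synapse_inputs, permanences)
        for col in range(len(synapse_inputs))
    ]


# Tiny made-up example: 8 input bits, 3 columns with 3 potential synapses each.
input_bits = [1, 0, 1, 1, 0, 0, 1, 0]
synapse_inputs = [[0, 2, 5], [1, 3, 6], [2, 3, 4]]
permanences = [[0.6, 0.4, 0.9], [0.7, 0.5, 0.55], [0.2, 0.8, 0.9]]

overlaps = compute_overlaps(input_bits, synapse_inputs, permanences)
```

How best to split this up in practice (one work-item per column, per synapse, or something in between) is exactly the sort of thing that needs experimenting with.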