Implementing HTM.Java on Apache Spark

papajim · May 4, 2017, 11:12am

Hello,

I’m really intrigued by the HTM theory and I’m thinking of implementing a version of the algorithm on Apache Spark. I’m aware of the Apache Flink and Akka implementations but I would like to get my hands dirty and implement a Spark version.
However, I was wondering if there are any known limitations that have prevented others from implementing HTM on Spark.
By limitations I mean, for example, that the Spark framework is unable to run HTM at scale, or that its micro-batch streaming model isn’t suitable for HTM.
From my point of view, I can’t see anything that would prevent me from running HTM on Spark, or even from scaling up the algorithm. I might be wrong though…

It’s weird that I couldn’t find any implementations on Spark, just some old mailing list threads…

Thanks!

cogmission · May 5, 2017, 2:07pm

Hi @papajim,

There is a community member, @EronWright, who implemented a Flink version that you can check out. At some point, both I and the company I work for, Cortical.io (sponsors of HTM.Java), had it in mind to do a Spark implementation. However, I have had (and still have) a more pressing goal I’m working on right now. As far as I know, there should be no limitations that would prevent a Spark implementation. Last year, we worked on a Persistence framework to be able to serialize HTMs to stream or disk, in anticipation of eventually doing a Spark implementation. Then Eron came in and did a Flink implementation.

EDIT: There is also an Akka implementation!

EronWright · May 25, 2017, 5:22pm

Thanks @cogmission. @papajim the beauty of using HTM in Flink is that you get a true streaming system with good connectors to various data sources (e.g. Kafka), that scales by creating separate HTM model instances for each ‘key’ in your data, with full checkpointing support.

Here’s the integration library for this:

I’ve been meaning to update the library with the latest Flink and HTM dependencies. With renewed interest from you and others, I’ll prioritize that.
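
To make the "separate model instance per key" idea above concrete, here is a minimal plain-Java sketch of the pattern. It uses neither Flink nor Spark nor the HTM.Java API: `PlaceholderModel` is a hypothetical stand-in (a running-mean deviation scorer, not an HTM), and `KeyedModelSketch`, `process` and `modelCount` are names made up for illustration. A streaming framework would normally own the per-key state and checkpoint it. The sketch only shows the routing.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyedModelSketch {

    /** Hypothetical stand-in for a model; this is NOT the HTM.Java API. */
    static final class PlaceholderModel {
        private long count = 0;
        private double mean = 0.0;

        /** Scores a value as its distance from the running mean, then updates the mean. */
        double score(double value) {
            double deviation = (count == 0) ? 0.0 : Math.abs(value - mean);
            count++;
            mean += (value - mean) / count;
            return deviation;
        }
    }

    // One independent model per key; a real framework would keep this as managed keyed state.
    private final Map<String, PlaceholderModel> modelsByKey = new HashMap<>();

    /** Routes a record to the model for its key, creating that model on first use. */
    double process(String key, double value) {
        return modelsByKey
                .computeIfAbsent(key, k -> new PlaceholderModel())
                .score(value);
    }

    /** Number of distinct keys seen so far. */
    int modelCount() {
        return modelsByKey.size();
    }

    public static void main(String[] args) {
        KeyedModelSketch router = new KeyedModelSketch();
        System.out.println(router.process("sensor-a", 10.0));
        System.out.println(router.process("sensor-a", 14.0));
        System.out.println(router.process("sensor-b", 100.0));
        System.out.println("models: " + router.modelCount());
    }
}
```

Because each key has its own model, values for `sensor-b` never affect the scores for `sensor-a`, which is what lets the work be partitioned by key across workers.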

avichai · May 28, 2019, 7:10am

Hey,
did you eventually implement it in Spark?
If so, can you share your findings? Maybe your GitHub project?

Dima · February 7, 2020, 4:03pm

I am also curious about the outcome of your implementation, because I am planning to do the same.
