HTM.Java Newbie here. I have been reading up and playing with the code for about 6 months now.
I have a setup with a bunch of Raspberry Pis (running Ubuntu Mate 16) and ingest the produced data streams into Flink for processing in HTM.java (via SQS, Kinesis, Kafka, what have you).
I’m primarily interested in detecting anomalies in the data streams produced by the RPi (Raspberry Pi) instances, and I am working on building an unsupervised system to detect them.
Please excuse my naive questions…
1. Can I model a “Network” as the object corresponding to an RPi instance, with each region within the network observing one metric stream and reporting anomalies? For example, Network (N1) has multiple Regions (R0 to Rn): R1 --> Temperature, R2 --> Pressure, etc.
2. Is this a viable solution?
3. Once a Network has been “started”, can we add regions dynamically?
4. What does “connecting” regions actually do? What is the purpose of “connecting” regions?
5. If I have a multi-field stream (and, from reading the many forums, there is a limitation on the fields that we can input), is there a way to mark fields so they are not used for detection?
6. What is the minus required setup for Anomaly Detection in such a scenario?
Sorry for the delayed response, I’m on vacation right now (until the 18th), and not checking the forums all that frequently. Let me attempt to answer some of your questions…
First, I would use a Network consisting of 1 Region and 1 Layer. Support for multiples of those is there to make room for porting any hierarchy development Numenta may engage in at some point in the future. But to answer your question, “connecting” regions and layers sets up a chain of sequentially grouped algorithms and tells the infrastructure what order to put the constructs in.
As stated before, just use 1 Region and 1 Layer, and input your data as comma-separated values or in another form (maps are an option also). If you want to process more data, then use separate Networks, each containing 1 Region and 1 Layer.
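To make the "separate Networks" suggestion concrete, here is a minimal sketch of keeping one model per device and metric, created lazily the first time a stream shows up. It is plain Java and does not call HTM.java: the class name `PerStreamModels`, the key format, and the factory function are all made up for illustration, and the type parameter `N` stands in for whatever you build per stream (for example a Network with 1 Region and 1 Layer).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical helper: keeps one model per (device, metric) stream.
 * N is a placeholder type so that this sketch compiles without HTM.java.
 */
final class PerStreamModels<N> {
    private final Map<String, N> models = new ConcurrentHashMap<>();
    // Assumption: the caller supplies a function that builds a fresh model for a stream key.
    private final Function<String, N> factory;

    PerStreamModels(Function<String, N> factory) {
        this.factory = factory;
    }

    /** Returns the model for this stream, creating it on first use. */
    N modelFor(String deviceId, String metric) {
        return models.computeIfAbsent(deviceId + "/" + metric, factory);
    }

    /** Number of streams seen so far. */
    int size() {
        return models.size();
    }

    public static void main(String[] args) {
        // A String stands in for a real network in this demo.
        PerStreamModels<String> registry = new PerStreamModels<>(key -> "network for " + key);
        System.out.println(registry.modelFor("rpi-01", "temperature"));
        System.out.println(registry.modelFor("rpi-01", "pressure"));
        System.out.println(registry.size());
    }
}
```

With this shape, a new metric or a new Pi simply gets its own model on first sight, so nothing has to be added to a network that is already running.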
No. The act that “seals” the Layer is the close() function, which is either called explicitly or called automatically when running the network (after calling start, or after the Network receives the first record of data).
At this point in time, no. That is something we are going to work on soon. For now, only input the fields you want to do inferencing on.
[quote=“madhan, post:1, topic:1471”]
6. What is the minus required setup for Anomaly Detection in such a scenario?
[/quote]
I assume you meant “minimum” and not minus?
For an example of how to set up anomaly detection, use something like the HotGym demo in the htm.java-examples repository. Those examples are one version behind because we just did a release last week, so they may not work correctly at the moment. I apologize for that.
Once you get the “mechanics” of how to use HTM.Java, it is the same as running the Python version of NuPIC, so any “data” or conceptual questions you have can be answered in the “nupic” forum (until I get back from vacation).
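On the point above about only inputting the fields you want inferencing on: since fields cannot be marked as ignored, one option is to drop the unwanted columns before the record reaches the network. The following is a rough sketch in plain Java (no HTM.java calls). The class name `FieldProjector` and the record layout (timestamp, device id, temperature, pressure) are assumptions for illustration, and the naive split on commas does not handle quoted fields.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: keeps only selected columns of a comma-separated record. */
final class FieldProjector {
    // Zero-based column indexes to pass through, in output order.
    private final int[] keep;

    FieldProjector(int... keep) {
        this.keep = keep.clone();
    }

    /** Returns a comma-separated line containing only the kept columns, trimmed. */
    String project(String csvLine) {
        String[] cols = csvLine.split(",", -1); // -1 keeps trailing empty fields
        List<String> out = new ArrayList<>(keep.length);
        for (int i : keep) {
            if (i < 0 || i >= cols.length) {
                throw new IllegalArgumentException("no column " + i + " in: " + csvLine);
            }
            out.add(cols[i].trim());
        }
        return String.join(",", out);
    }

    public static void main(String[] args) {
        // Assumed layout: timestamp, device id, temperature, pressure.
        FieldProjector timestampAndTemperature = new FieldProjector(0, 2);
        System.out.println(
            timestampAndTemperature.project("2016-12-01 10:00:00, rpi-01, 21.5, 1013.2"));
        // prints 2016-12-01 10:00:00,21.5
    }
}
```

In a Flink job this kind of projection would naturally sit in a map step ahead of whatever feeds the network, but how you wire it in depends on your pipeline.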
Sincerely appreciate your taking the time from your vacation to answer my questions.
Enjoy your vacation … I’ll work on a model along your suggestions and share what I have.
Thanks so much!
If you are only interested in anomaly detection, there is no need to include a classifier in your network. The wiki shows a minimal configuration like the one below:
Parameters p = NetworkDemoHarness.getParameters(); // "Default" test parameters (you will need to tweak)
p = p.union(NetworkDemoHarness.getNetworkDemoTestEncoderParams()); // Combine "default" encoder parameters.
Network network = Network.create("Network API Demo", p) // Name the Network whatever you wish...
    .add(Network.createRegion("Region 1") // Name the Region whatever you wish...
        .add(Network.createLayer("Layer 2/3", p) // Name the Layer whatever you wish...
            .alterParameter(KEY.AUTO_CLASSIFY, Boolean.TRUE) // (Optional) Add a CLAClassifier
            .add(Anomaly.create()) // (Optional) Add an Anomaly detector
            .add(new TemporalMemory()) // Core component, but also "optional"
            .add(new SpatialPooler()) // Core component, but also "optional"
            // Sensors automatically connect to your source data, but you may omit this and pump data directly in!
            .add(Sensor.create(FileSensor::create, SensorParams.create(
                Keys::path, "", ResourceLocator.path("rec-center-hourly.csv"))))));
network.start();
…But what you would want in that case is this:
Parameters p = NetworkDemoHarness.getParameters(); // "Default" test parameters (you will need to tweak)
p = p.union(NetworkDemoHarness.getNetworkDemoTestEncoderParams()); // Combine "default" encoder parameters.
Network network = Network.create("Network API Demo", p) // Name the Network whatever you wish...
    .add(Network.createRegion("Region 1") // Name the Region whatever you wish...
        .add(Network.createLayer("Layer 2/3", p) // Name the Layer whatever you wish...
            .add(Anomaly.create()) // (Optional) Add an Anomaly detector
            .add(new TemporalMemory()) // Core component, but also "optional"
            .add(new SpatialPooler()) // Core component, but also "optional"
            // Sensors automatically connect to your source data, but you may omit this and pump data directly in!
            .add(Sensor.create(FileSensor::create, SensorParams.create(
                Keys::path, "", ResourceLocator.path("rec-center-hourly.csv"))))));
network.start();
Note that the line .alterParameter(KEY.AUTO_CLASSIFY, Boolean.TRUE) is not included here. That is the line which tells the NAPI (Network API) to automagically set up a classifier.
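For some intuition about what the anomaly step gives you once the classifier is gone, here is a small sketch of the raw anomaly score as it is usually described for HTM: the fraction of currently active columns that were not predicted on the previous step. This is plain Java, not HTM.java's own code. The names `RawAnomalySketch` and `rawScore` are invented for this example, and returning 0.0 when no columns are active is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Illustrative only: raw anomaly score from active and previously predicted columns. */
final class RawAnomalySketch {

    /**
     * Returns a value in [0, 1]: 0.0 when every active column was predicted,
     * 1.0 when none of them were.
     */
    static double rawScore(Set<Integer> activeColumns, Set<Integer> prevPredictedColumns) {
        if (activeColumns.isEmpty()) {
            return 0.0; // assumption of this sketch: nothing active means nothing unexpected
        }
        Set<Integer> predictedAndActive = new HashSet<>(activeColumns);
        predictedAndActive.retainAll(prevPredictedColumns);
        int unpredicted = activeColumns.size() - predictedAndActive.size();
        return unpredicted / (double) activeColumns.size();
    }

    public static void main(String[] args) {
        Set<Integer> active = new HashSet<>(Arrays.asList(1, 2, 3, 4));
        Set<Integer> predicted = new HashSet<>(Arrays.asList(1, 2, 9));
        // Two of the four active columns were predicted, so half are unexpected.
        System.out.println(rawScore(active, predicted)); // prints 0.5
    }
}
```

A score near 0 means the input matched what the temporal memory expected, and a score near 1 means it was a surprise, which is why the spatial pooler and temporal memory stay in the layer even when the classifier is dropped.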
I also wanted to point out something else since you mentioned “flink”. A community member (@EronWright) created a framework for distributing htm.java networks concurrently using Apache Flink.
Also, you can get some ideas of how and why you might vary the network’s configurations by having a look at the NetworkTest and the LayerTest in the “test” directory.