Scaling 'hot gym' for multiple locations

phil_d_cat · May 15, 2017, 12:13pm

How would I go about scaling up an app like the ‘hot gym’ if the data was coming from multiple locations? Would I need to create a Network for each location, then retrieve and store it as each value is sent through?

I.e., assume the data is coming in as {timestamp, param0, param1, param2, result}, where the values for param0-2 together make up a “rich key”. Assume I only want to compute anomalies or make predictions on the values for result.

Further: in this topic you reference the INFERRED_FIELDS_MAP. Can you go into more detail about how to use this and how it would apply here?

cogmission · May 15, 2017, 12:33pm

Maybe I’m not understanding you, but I don’t see how data coming in from multiple locations matters to the HTM? If you mean scaling in terms of processing speed, you are only going to get the speed you get, and multiple models aren’t going to help with that. Each model needs all of the data, because models in HTM-Land process sequences, and any missing element of a sequence would result in inaccurate inferences.

The INFERRED_FIELDS_MAP is simply the subset of fields (out of all the fields in your input), which you want to detect anomalies in or do prediction on.

It is only relevant if you are doing “Prediction” as opposed to “Anomaly Detection”, because its purpose is to help the “Classifier” assembly create classifiers for only the desired fields in the input, rather than for all fields, which is what it did before the INFERRED_FIELDS_MAP settings and code were added.

phil_d_cat · May 15, 2017, 3:07pm

Maybe I’m not understanding you, but I don’t see how data coming in from multiple locations matters to the HTM?

I’m thinking of a system that would do anomalies / predictions for multiple, independent locations. I.e., the values used to do computation for one site wouldn’t impact or influence those of other sites.

cogmission · May 15, 2017, 3:49pm

You would have to test it and see how many instances of the same model you need to run, depending on the request handling speed. One advantage is that you can run the same model everywhere since HTMs can be serialized. You would have to guarantee serial querying though, since the HTM isn’t guaranteed to be concurrency safe.
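As a rough illustration of the two points above (serial querying and serialization), here is a minimal Python sketch. `TinyModel` is a hypothetical stand-in and not an HTM.Java or NuPIC class, and the JSON snapshot only stands in for whatever serialization the real library provides. The wrapper holds a lock so that only one caller at a time can feed the model.

```python
import json
import threading


class TinyModel:
    """Hypothetical stand-in for an HTM model; not a real library class."""

    def __init__(self, steps=0, total=0.0):
        self.steps = steps
        self.total = total

    def run(self, value):
        # Stateful update: the result depends on every value seen so far,
        # which is why calls must not interleave.
        self.steps += 1
        self.total += value
        return self.total / self.steps


class SerialModel:
    """Wraps a model so that all access is serialized through one lock."""

    def __init__(self, model):
        self._model = model
        self._lock = threading.Lock()

    def run(self, value):
        with self._lock:
            return self._model.run(value)

    def snapshot(self):
        # Assumption: the model state can be captured as plain fields.
        with self._lock:
            return json.dumps({"steps": self._model.steps,
                               "total": self._model.total})

    @classmethod
    def restore(cls, blob):
        state = json.loads(blob)
        return cls(TinyModel(state["steps"], state["total"]))
```

A snapshot taken on one machine could then be restored on another, and the restored model continues from the same state.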

rhyolight · May 15, 2017, 5:17pm

I would first try a model per location. If that doesn’t give you the results you want, perhaps a multi-field model. But we’ve found that model-per-data-source does pretty well for anomaly detection.
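A model-per-location setup could be sketched as a registry keyed by the “rich key” from the original question. This is only a sketch in Python: `RunningMeanModel` is a hypothetical placeholder and not a NuPIC or HTM.Java class, and the field names `param0`, `param1`, `param2`, and `result` are taken from the record layout described earlier in the thread.

```python
class RunningMeanModel:
    """Hypothetical placeholder for an HTM model; not a real library class.

    It scores a value by its distance from the running mean of the values
    seen so far, only so the routing logic has something to call.
    """

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def run(self, value):
        score = abs(value - self.mean) if self.count else 0.0
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return score


class ModelRegistry:
    """Keeps one independent model per location key."""

    def __init__(self, factory):
        self._factory = factory  # callable that builds a fresh model
        self._models = {}

    def model_for(self, key):
        # Create the model lazily the first time a location is seen.
        if key not in self._models:
            self._models[key] = self._factory()
        return self._models[key]

    def process(self, record):
        # The three params together identify the location.
        key = (record["param0"], record["param1"], record["param2"])
        # Only the result field is fed to that location's model.
        return self.model_for(key).run(record["result"])

    def __len__(self):
        return len(self._models)
```

Because each key gets its own model, values from one site never influence the state kept for another site.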
