Is HTM the right tool for my case? (Online detection of anomalous values in a data stream)


#1

I have a constant stream of events, each carrying a value that represents a duration in milliseconds. I also have another stream of values that need to be compared against the model to detect whether they are anomalies. Is the HTM AnomalyLikelihood class the right tool for this task? I tried to set up estimateAnomalyLikelihoods and updateAnomalyLikelihoods, but the API doesn’t make clear how they should be used, and the documentation doesn’t help either.


#2

You have two streams? Or are you talking about a training dataset and a test data set?


#3

I have only two streams, no dataset. The first stream should shape the model; the second one acts as queries against the model.


#4

Right, so you have one training stream, one testing stream. You train on one set of data like you said, letting the model learn its characteristics. In this case, this should be non-anomalous data.

Then you want to turn learning off in the model once it has seen enough, and pass in the “real” data, which might contain anomalies. While keeping learning turned off in the model, it will report anomaly scores/likelihoods based upon comparing the current data stream to what it was trained with.
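The train-then-freeze workflow described above can be sketched without any HTM code. The example below is only an illustration of the control flow: the class `FrozenBaselineDetector` and its `learning` flag are hypothetical names, and the "model" is a plain running mean and variance (Welford's method) standing in for an HTM model, not the NuPIC API.

```python
import math


class FrozenBaselineDetector:
    """Hypothetical stand-in for an HTM model: learns a running mean and
    variance while `learning` is True, then only scores once frozen."""

    def __init__(self):
        self.learning = True
        self.n = 0        # number of values learned so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations (Welford)

    def _update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def score(self, x):
        """Return how many standard deviations x lies from the learned mean.
        The value is also learned, but only while `learning` is True."""
        if self.n < 2:
            z = 0.0  # not enough data to judge yet
        else:
            std = math.sqrt(self.m2 / (self.n - 1))
            z = abs(x - self.mean) / std if std > 0 else 0.0
        if self.learning:
            self._update(x)
        return z


# Phase 1: the training stream (assumed non-anomalous) shapes the model.
detector = FrozenBaselineDetector()
for duration_ms in [480, 510, 495, 520, 505, 490, 500, 515]:
    detector.score(duration_ms)

# Phase 2: freeze learning, then query with values from the second stream.
detector.learning = False
normal_score = detector.score(505)
slow_score = detector.score(5000)
print(normal_score, slow_score)
```

With learning switched off, the queries do not change what the detector has learned, so a burst of slow values in the second stream cannot gradually become the new normal.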

It will not be able to tell you why an anomaly was detected.

Also, if you are processing at millisecond precision, you won’t be able to benefit from human time encoding (like hours, days, weeks, etc). If you plotted out this data, would you be able to spot patterns and anomalies yourself?


#5

The system should detect if “something takes too long”. It usually takes about 500 ms (measured average), but this can differ depending on some other parameter. I could just check whether the action takes, for example, 10 times longer than the average, but I am wondering if there is a smarter solution.
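For comparison, here is a minimal sketch of a baseline that is one step beyond "10 times the average": a separate upper limit per value of the other parameter, computed from the median and the median absolute deviation (MAD) so that a few slow outliers in the history do not inflate the limit. The function names, the group keys `"small"` and `"large"`, the sample durations, and the multiplier `k=5.0` are all assumptions made for illustration; this is not HTM.

```python
from collections import defaultdict
from statistics import median


def fit_thresholds(samples, k=5.0):
    """samples: iterable of (group_key, duration_ms) pairs.
    Returns {group_key: upper_limit_ms}, where the limit is
    median + k * 1.4826 * MAD (1.4826 scales MAD to a standard
    deviation for normally distributed data)."""
    by_group = defaultdict(list)
    for key, duration in samples:
        by_group[key].append(duration)

    limits = {}
    for key, values in by_group.items():
        med = median(values)
        mad = median([abs(v - med) for v in values])
        # If MAD is 0 (all values identical), the limit equals the median.
        limits[key] = med + k * 1.4826 * mad
    return limits


def is_too_slow(limits, key, duration_ms):
    """True if the duration exceeds the learned limit for its group.
    Unknown groups are not flagged, since nothing is known about them."""
    limit = limits.get(key)
    return limit is not None and duration_ms > limit


# Made-up history: the duration depends on a hypothetical "size" parameter.
history = [
    ("small", 480), ("small", 500), ("small", 520), ("small", 510), ("small", 490),
    ("large", 1900), ("large", 2000), ("large", 2100), ("large", 2050), ("large", 1950),
]
limits = fit_thresholds(history)
print(is_too_slow(limits, "small", 1900))  # slow for a "small" action
print(is_too_slow(limits, "large", 1900))  # ordinary for a "large" action
```

The point of grouping is that the same duration can be normal for one parameter value and far too long for another, which a single global average cannot express.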