Anomaly detection in multiple data streams

Hi,

I am trying to test Numenta's anomaly detection on data coming from multiple data streams.
I am trying to use the mathematical formulation that applies a convolution to each model and aggregates the results into a single anomaly likelihood value, which is described in the following paper: [https://numenta.com/resources/papers/unsupervised-real-time-anomaly-detection-for-streaming-data/]

First of all, is this implemented in the source code of nupic or the NAB benchmark? I went over the code and didn’t find it.

Secondly, I tried to implement it myself. I can obtain the convolution value for each individual model, but I have an issue aggregating these values into the anomaly likelihood. The paper states that, because of precision issues, the sum of log probabilities was computed instead of the product of the probabilities. What I am confused about is how to compute the anomaly likelihood from that sum of log probabilities.
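To make my question concrete, here is a small sketch (plain Python; the per-model tail-probability values are made up) of the precision issue I mean:

```python
import math

# Hypothetical per-model tail probabilities (M = 4 models here).
tail_probs = [1e-80, 3e-95, 2e-70, 5e-110]

# Multiplying directly underflows to 0.0 in double precision ...
product = 1.0
for q in tail_probs:
    product *= q          # ends up exactly 0.0

# ... while the sum of logs stays perfectly representable.
log_sum = sum(math.log(q) for q in tail_probs)   # about -813.9

# In exact arithmetic, likelihood = 1 - product = 1 - exp(log_sum),
# but exp(log_sum) underflows just like the product, which is exactly
# where I am stuck.
print(product, log_sum)
```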

I would appreciate it if anyone could help me with this.

Thank you.


Hi,

Hmm, good question. We did a quick set of experiments using that technique on some data and showed it worked well there. However, we did not have much access to multi-stream data where it made sense to use the technique, and we never pursued a full-blown implementation. (That’s why it’s in the supplementary materials.) We don’t have it in nupic or NAB.

It was a while ago, and I’m afraid I don’t have a proper answer anymore. I would guess that if you ignore the 1− in Eq. S7 and just threshold the sum of the logs of the M components of that product (i.e. 2 (G*Q)…), it will work well. Your threshold might need to be <= instead of >= to detect the anomaly.
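Something along these lines is what I have in mind. It is only a rough sketch, not anything we ship in nupic or NAB; the function name and the threshold value are placeholders you would tune on your own data:

```python
import math

# Placeholder threshold, chosen for illustration only.
LOG_THRESHOLD = math.log(1e-5)

def is_anomalous(per_model_components, log_threshold=LOG_THRESHOLD):
    # per_model_components: the M terms of the product in Eq. S7,
    # i.e. the 2*Q(...) value you already compute for each model.
    # Sum of logs replaces the (underflow-prone) product of those terms.
    log_sum = sum(math.log(c) for c in per_model_components)
    # Because we drop the "1 -" and work in log space, the comparison is
    # flipped: a very small product (very negative log-sum) means anomaly.
    return log_sum <= log_threshold

# Example: three models, each already giving its 2*Q(...) value.
print(is_anomalous([0.2, 1e-4, 3e-3]))  # True for this threshold
```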

I’m curious how well it works on your data.

