About Anomaly Detection Thresholding

Hi,

Thanks a lot for the responses to my previous questions!

I get two values during anomaly detection: the raw anomaly score and the anomaly likelihood.
Which of the following strategies do you recommend for thresholding?

  1. use only anomaly likelihood > 0.95 (for example)
  2. use only raw anomaly score > 0.95 (for example)
  3. use both anomaly likelihood and raw anomaly score, combined with an AND operation

Thanks.

We have found that the likelihood value is generally more useful as a metric. Your work will now be to decide the threshold for this value, and 0.95 is probably a good place to start. Move this up and down to get to a place where truly anomalous behavior is flagged, while other behavior is not.

For the anomaly likelihood, we actually set thresholds like 0.99995 (yellow) or 0.99999 (red). There is also the “log likelihood”, which converts that range to a more visualizable range of values, for which we set thresholds of 0.4 (yellow) or 0.5 (red).
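To make the log-scale idea concrete, here is a minimal Python sketch. It does not call NuPIC; the transform `log(1 + eps - likelihood) / log(eps)` with `eps = 1e-10` is an assumption about how such a log likelihood can be computed, and the names `log_likelihood` and `classify` are hypothetical. The 0.4 and 0.5 cutoffs are the ones quoted above.

```python
import math

# Assumed constant: a small epsilon keeps log() finite when likelihood == 1.0.
EPSILON = 1e-10


def log_likelihood(likelihood):
    """Spread likelihoods close to 1.0 over an easier-to-read range.

    Assumed transform (not taken from the thread):
    log(1 + eps - likelihood) / log(eps).
    """
    return math.log(1.0 + EPSILON - likelihood) / math.log(EPSILON)


def classify(likelihood, yellow=0.4, red=0.5):
    """Return 'red', 'yellow' or 'green' based on the log-scale value."""
    score = log_likelihood(likelihood)
    if score >= red:
        return "red"
    if score >= yellow:
        return "yellow"
    return "green"


if __name__ == "__main__":
    for value in (0.95, 0.99995, 0.999999):
        print(value, round(log_likelihood(value), 3), classify(value))
```

Under this assumed transform, a likelihood of 0.99995 maps to roughly 0.43 and 0.99999 maps to roughly 0.5, so the two sets of thresholds are approximately the same cutoffs on different scales.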

In other words, I’d expect a LOT of false positives with a threshold of 0.95 on the anomaly likelihood.
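A rough back-of-the-envelope sketch of why: assume (this is a simplification, not something the likelihood guarantees) that on normal data about a fraction `1 - threshold` of records exceed the threshold. The function name and the sampling rate of one record every 5 minutes (288 per day) are hypothetical.

```python
def expected_flags_per_day(threshold, records_per_day):
    """Estimate flagged records per day on normal data.

    Assumption: roughly (1 - threshold) of normal records exceed the
    threshold. Real streams will deviate from this.
    """
    return (1.0 - threshold) * records_per_day


if __name__ == "__main__":
    records_per_day = 288  # hypothetical: one record every 5 minutes
    for threshold in (0.95, 0.99995, 0.99999):
        flags = expected_flags_per_day(threshold, records_per_day)
        print(threshold, round(flags, 4))
```

With these assumptions, 0.95 yields about 14 flags per day, while 0.99999 yields about one flag per year.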

Thank you rhyolight and scott!

I find many data points where the anomaly likelihood is very high (say 0.99995) but the raw anomaly score is 0.0.
So you mean I can ignore the raw anomaly score. Am I understanding correctly?

Here is another resource that describes the details, but yes, I suggest you ignore the anomaly score and use the likelihood entirely.

Thank you very much!!!

Do you have any comparative studies between NuPIC and TensorFlow neural networks on the accuracy of the two approaches for a simple problem, such as recognizing fixed-size images of numbers?

I understand the differences between traditional ML and HTM. What I am trying to assess is where HTM-based systems stand today relative to the current state of traditional ML.

Here is the link to the Google TensorFlow presentation, which shows some pretty significant learning results: https://www.youtube.com/watch?v=u4alGiomYP4

Thanks in advance
T4FP2

We don’t have direct comparisons of HTM to TensorFlow, but you must understand that TensorFlow is a platform for machine learning, and you can run many algorithms with it. It is set up for spatial scaling and processing, not necessarily temporal. There is no comparison to make here, really.

However, we do have benchmarks comparing HTM doing streaming scalar anomaly detection vs other ML techniques that do handle temporal data (LSTM for example) in the Numenta Anomaly Benchmark.