Naive predictor: how to compute an anomaly score from 2 scalars? Include error metrics in NAB? Is anomaly score the optimal metric?

Morning good fellas,

I have begun fiddling with NAB and have 2 questions for you.

Technical: “How do I compute an anomaly score for a time series from only 2 scalars: current and predicted?”

I have implemented a simple yet hard-to-beat baseline model: “naive”, or random-walk, forecasting. It simply carries the previously seen value forward, so Prediction(t+1) = Input(t).

Now, the predictor is embarrassingly easy. But I got stuck on getting an anomaly score from the two values.
(My feeling is something like:

diff = abs(current - predicted)
norm = standard deviation of diffs over the time series
score = something like sigmoid(diff / norm)

Any better implementation ideas?
You can see my code.
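Concretely, something like this is what I have in mind: a minimal, runnable sketch that keeps a running standard deviation of the diffs (the class name and the tanh squashing are just my choices, not anything from NAB):

    import math

    class NaiveAnomalyScorer:
        """Random-walk forecaster: Prediction(t+1) = Input(t), with the
        absolute prediction error squashed into an anomaly score in [0, 1).
        Sketch only; names and scoring choices are my own."""

        def __init__(self):
            self.last_value = None
            # Running stats (Welford's algorithm) over the stream of diffs.
            self.n = 0
            self.mean = 0.0
            self.m2 = 0.0

        def score(self, current):
            if self.last_value is None:
                self.last_value = current
                return 0.0  # no prediction available yet

            predicted = self.last_value       # Prediction(t+1) = Input(t)
            diff = abs(current - predicted)

            # Update the running mean/std of diffs.
            self.n += 1
            delta = diff - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (diff - self.mean)
            std = math.sqrt(self.m2 / self.n) if self.n > 1 else 0.0

            self.last_value = current

            if std == 0.0:
                return 0.0
            # Plain sigmoid(diff/std) would never go below 0.5, since the
            # normalized diff is non-negative; tanh(z/2) == 2*sigmoid(z) - 1
            # maps z = 0 to score 0 and large z toward 1.
            z = diff / std
            return math.tanh(z / 2.0)

    # Usage: one score per input value, streamed.
    scorer = NaiveAnomalyScorer()
    for x in [1.0, 1.1, 0.9, 1.0, 5.0]:
        print(scorer.score(x))

Whether a fixed squashing like this is good enough, versus something distribution-based, is exactly what I'm unsure about.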

And theoretical: “Is the anomaly score metric the best for comparing time-series algorithms? Is it worth including other metrics (standard MSE, R², …)?”

In the optimal case, the best predictor is also the best anomaly detector, in the sense of predicting the t+1 value on TS (time series) data.
But what about other cases?

I’d like to add an error-metric score to NAB:

This would allow us to compare, e.g., the naive detector, which cannot produce anomaly scores properly. Another argument is that papers more often report error-metric results than anomaly-score results.
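For illustration, the error metrics themselves are cheap to compute from the same prediction stream. A plain-Python sketch (not NAB's API, just to show what I mean by an error-metric score):

    def error_metrics(actuals, predictions):
        """Standard forecast-error metrics over two aligned series.
        Sketch only; NAB does not currently expose anything like this."""
        assert len(actuals) == len(predictions) and len(actuals) > 0
        n = len(actuals)
        errors = [a - p for a, p in zip(actuals, predictions)]
        mse = sum(e * e for e in errors) / n
        mae = sum(abs(e) for e in errors) / n
        # R^2 relative to the mean of the actuals: 1 - SS_res / SS_tot.
        mean_a = sum(actuals) / n
        ss_tot = sum((a - mean_a) ** 2 for a in actuals)
        r2 = 1.0 - (mse * n) / ss_tot if ss_tot > 0 else float("nan")
        return {"MSE": mse, "MAE": mae, "R2": r2}

Any detector that makes a t+1 prediction could then be scored this way alongside the usual NAB anomaly score.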

Cheers,


Thanks for your recent work on this, Marek. We are going to have to look into this soon, but we’re not in a rush (yet). We want NAB to run in Python 3 by the end of the year. Please stand by; hopefully we’ll have some more resources soon…


Thanks for the heads-up, Matt!
That’s not too soon, given all the work depends on the Py3 change. But it’s good to know the situation; I’ll open PRs for the official NAB, and meanwhile integrate them into community/NAB.

Coding aside, it would be cool if we could discuss some concepts (like acceptance of the error metrics) here or on GH, so people (myself included) can get feedback before working on it.