I have a question regarding NAB scores. I am currently evaluating a few algorithms with my own dataset. I am evaluating them based on their optimized and normalized scores. However, some of the scores are negative values. To me, this implies that these algorithms perform worse than the null detector in that they raise false positives that end up affecting the total score. But, shouldn’t the optimizer find a threshold value that is high enough such that there are no anomalies detected at all, which should ultimately lead to a better score (of 0)? Any advice to help me better understand this behaviour would be greatly appreciated. Thank you in advance.
First, can you confirm that you are running all steps of NAB? The run.py command takes the following arguments, one for each of the four steps: --detect --optimize --score --normalize. If you don’t run the optimize step, then scoring could be using a bad threshold.
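For reference, here is a rough sketch of launching all four steps in one go from Python. The helper names (build_nab_command, run_nab) are made up for illustration and are not part of NAB; the only things taken from NAB itself are run.py and the four flags listed above. It assumes you pass the path to your NAB checkout, and you may need extra arguments to select your own detectors.

```python
import subprocess
import sys

# The four pipeline steps, in the order they should run.
NAB_STEPS = ("--detect", "--optimize", "--score", "--normalize")


def build_nab_command(steps=NAB_STEPS, script="run.py"):
    """Build the argument list for one run.py invocation (hypothetical helper)."""
    return [sys.executable, script, *steps]


def run_nab(nab_root, steps=NAB_STEPS):
    """Run the chosen steps from the NAB checkout at nab_root (an assumed path)."""
    # check=True raises CalledProcessError if run.py exits with an error.
    return subprocess.run(build_nab_command(steps), cwd=nab_root, check=True)
```

Calling run_nab("/path/to/NAB") would run every step, so the optimizer always gets a chance to pick a threshold before scoring.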
If you run all of the steps and the threshold still isn’t set high enough to avoid false positives, then perhaps there is a bug in the optimizer. In that case, you can manually set the threshold in config/thresholds.json to a value larger than one (the maximum anomaly score) and then rerun only the scoring and normalizing steps. Make sure not to rerun the optimize step, since it will overwrite your manually entered thresholds. This should guarantee that you get scores no worse than the null detector.
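If editing the file by hand is tedious, something like the following could do the override. This is only a sketch: it assumes thresholds.json is laid out as detector name, then scoring profile, then an object with a "threshold" field, so please check that against your own file first. The function names and the default of 1.1 are my own choices, not anything NAB defines.

```python
import copy
import json
from pathlib import Path


def override_thresholds(thresholds, detector, new_threshold=1.1):
    """Return a copy with every profile threshold for `detector` raised.

    Assumed layout: {detector: {profile: {"threshold": float, ...}}}.
    """
    updated = copy.deepcopy(thresholds)  # leave the caller's dict untouched
    for profile in updated[detector].values():
        # Anything above 1 (the maximum anomaly score) means nothing is flagged.
        profile["threshold"] = new_threshold
    return updated


def patch_thresholds_file(path, detector, new_threshold=1.1):
    """Rewrite the thresholds file in place (e.g. config/thresholds.json)."""
    path = Path(path)
    thresholds = json.loads(path.read_text())
    patched = override_thresholds(thresholds, detector, new_threshold)
    path.write_text(json.dumps(patched, indent=4, sort_keys=True))
```

After patching, rerun only --score and --normalize so the optimizer does not overwrite the new values.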
Please report back with the results so we know if there is a bug in NAB!