Evaluating the performance of the real-time anomaly detection included in the Elastic Stack

I am about to start my MSc thesis, and I am planning a comparative study between the anomaly detection algorithm included in the Elastic Stack and one or two other frameworks/implementations.

I was wondering whether it is possible to use NAB (the Numenta Anomaly Benchmark) to evaluate the performance and quality of the Elastic anomaly detection algorithm. I know that Elastic ML is open source. Does anybody know whether it has already been evaluated?

Thank you in advance for your help.


Definitely possible, see NAB Entry Points.

It is not on the scoreboard, so go for it!


Hello again,

I have spent the past two weeks trying to understand how to do it, but I find it very difficult.
That repository is pretty complex, and I cannot tell whether this is feasible. Could you provide any help or more documentation on the entry points?

Sure, NAB runs natively in Python 3. What environment does the Elastic anomaly detection algorithm run in? If there is a way to run it from Python 3, that makes things easier.

The Elastic AD algorithm is written in C++. Setting up a build environment for the ml-cpp native code is complex; you can have a look at this: build-setup.
Normally, anomaly detection jobs are run through the Kibana GUI, but that won’t help with registering a score in the benchmark.

Maybe I am underestimating the complexity of doing such a thing. If you think this cannot be achieved within the scope of an MSc thesis, considering it won’t be my one and only focus, please feel free to tell me.

Any suggestion would be appreciated.

It can be achieved. I suggest you follow the example of the Twitter ADVec detector, which is written in R, executed in its own runtime on the NAB dataset, and integrated with the NAB scorer through a file interface. The primary work is getting the NAB datasets into your environment, running the algorithm, and outputting the intermediary files required by the NAB scorer.
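To make the file-interface idea concrete, here is a minimal Python sketch of the last step, writing a per-file results CSV. It is not NAB's own code. The function name `export_scores` and the `score_for` callback are hypothetical, and the column names (`timestamp`, `value`, `anomaly_score`) and the 0 to 1 score range are assumptions to check against the NAB Entry Points page before relying on them.

```python
import csv
from pathlib import Path
from typing import Callable


def export_scores(input_csv: Path, output_csv: Path,
                  score_for: Callable[[str, float], float]) -> int:
    """Copy a timestamp,value CSV and append an anomaly_score column.

    `score_for` is a hypothetical callback standing in for whatever external
    detector produced the scores (for example, a lookup into exported results).
    Returns the number of data rows written.
    """
    output_csv.parent.mkdir(parents=True, exist_ok=True)
    rows_written = 0
    with input_csv.open(newline="") as src, output_csv.open("w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.writer(dst)
        # Assumed header layout; verify against the NAB documentation.
        writer.writerow(["timestamp", "value", "anomaly_score"])
        for row in reader:
            score = score_for(row["timestamp"], float(row["value"]))
            # Clamp to [0, 1], the score range assumed in this sketch.
            score = min(1.0, max(0.0, score))
            writer.writerow([row["timestamp"], row["value"], score])
            rows_written += 1
    return rows_written
```

With scores exported from another tool into a dictionary keyed by timestamp, the callback could be as simple as `lambda ts, value: scores.get(ts, 0.0)`.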


Well, I looked at that, but building a detector around the Elasticsearch code seems very different. It is very hard to build, and that repo contains not only the anomaly detection but the entire set of Elasticsearch ML tools: forecasting, regression, etc.

I was wondering whether I could achieve this with a more analytical approach, like running the anomaly detection on the Numenta dataset and reporting the scores manually in the right format, or something like this. I am asking because the Elasticsearch AD is meant to be run through the GUI.

The only problem with this strategy is that when I want to run an AD job with Elasticsearch, I need to choose the bucket span for the time series, and I think this will affect the detection. Is there any workaround?

Thank you

What do you mean by this?

When analyzing data, Elasticsearch uses the concept of a bucket to divide up a continuous stream of data into batches for processing.

For example, if you were monitoring the average response time of a system, using a bucket span of 1 hour means that at the end of each hour we would calculate the average (mean) value of the last hour’s worth of data and compute the anomalousness of that average value compared to previous hours.
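As a rough illustration of the bucketing step only (not of how Elastic computes anomalousness), the Python sketch below groups timestamped values into fixed-width buckets and averages each one. The function name `bucket_means` is hypothetical, and aligning bucket boundaries to the Unix epoch is an assumption made for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Reference point for bucket alignment (an assumption of this sketch).
EPOCH = datetime(1970, 1, 1)


def bucket_means(points, bucket_span):
    """Average (timestamp, value) points within fixed-width time buckets.

    `points` is an iterable of (naive datetime, float) pairs and
    `bucket_span` is a timedelta. Returns {bucket_start: mean_value}.
    """
    totals = defaultdict(lambda: [0.0, 0])  # bucket_start -> [sum, count]
    for ts, value in points:
        index = (ts - EPOCH) // bucket_span   # whole buckets since EPOCH
        start = EPOCH + index * bucket_span   # start time of this bucket
        totals[start][0] += value
        totals[start][1] += 1
    return {start: total / count
            for start, (total, count) in sorted(totals.items())}


response_times = [
    (datetime(2020, 1, 1, 10, 5), 100.0),
    (datetime(2020, 1, 1, 10, 55), 200.0),
    (datetime(2020, 1, 1, 11, 10), 400.0),
]
hourly = bucket_means(response_times, timedelta(hours=1))
# hourly maps 10:00 -> 150.0 and 11:00 -> 400.0
```

A detector working on these averages produces one result per bucket rather than one per original data point, which is why the choice of bucket span matters when comparing against per-record benchmark scores.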

You can read more here.

It sounds like there may be some complications getting data into a format that works for Elasticsearch? I don’t have the time to look into this. I wish you luck.