NAB: codependent metrics from a single system?

Oblynx · March 11, 2019, 9:09pm

Do any of the NAB datasets provide different metrics from the same physical system, which would therefore give different views into the same behavior? For example, do the different realAWSCloudwatch metrics come from the same machines, or from somehow correlated machines (maybe a single cluster)?

The idea is to get a set of “codependent” metrics, in the sense that information from one of them can help predict another. If NAB’s metrics don’t have this property, I’m researching a few possibilities on Kaggle, but any suggestions are more than welcome.

sheiser1 · March 11, 2019, 9:51pm

Good question. I don’t know the answer, though to search for ‘codependent’ metrics you could theoretically use swarming. Just set the predictedField to whichever metric you want to predict; the swarm will then tell you which other metric(s) it found most helpful in predicting it (along with optimal hyperparameter settings). You could do this for as many predicted metrics as you like and keep track of the ‘codependencies’.

Oblynx · March 11, 2019, 10:13pm

True, we could perform statistical tests of the predictive power of some of the metrics on others. In this case though, I’d prefer a theoretical reason to expect a codependence, such as the metrics coming from different sensors on the same system.
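As a rough illustration of what such a statistical test might look like, here is a minimal sketch that measures lagged correlation between two metrics. It is not NAB or NuPIC code: the function name `lagged_correlation` and the synthetic series are assumptions made for the example, and plain Pearson correlation is only one simple stand-in for "predictive power".

```python
import numpy as np

def lagged_correlation(x, y, max_lag):
    """Pearson correlation between x[t] and y[t + lag], for lag = 0..max_lag.

    Assumes x and y are equally long, evenly sampled series.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    result = {}
    for lag in range(max_lag + 1):
        a = x[: len(x) - lag]   # earlier values of x
        b = y[lag:]             # values of y that come `lag` steps later
        result[lag] = float(np.corrcoef(a, b)[0, 1])
    return result

# Synthetic data (not from NAB): y echoes x three steps later, plus noise.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = np.roll(x, 3) + 0.1 * rng.normal(size=500)

scores = lagged_correlation(x, y, max_lag=5)
best_lag = max(scores, key=lambda lag: abs(scores[lag]))
print(best_lag, round(scores[best_lag], 3))
```

A strong correlation at some lag only says that one series carries information about the other in this sample; it does not by itself say the two metrics come from the same system.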

Let’s say that the system (which you only know through the metrics) is in a state A; then all the metrics will embed some info about state A. Then the system transitions to state B, and so the metrics will now embed info about B (as in a Hidden Markov Model).
If my task is to guess the state of the system from the metrics, and no metric is sufficiently informative on its own, maybe I can combine their info. This is the scenario I’d like to work on with NAB.
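To make the scenario concrete, here is a minimal sketch of combining several weakly informative metrics to track a hidden state, using the forward (filtering) recursion of a Hidden Markov Model. Everything specific here is an assumption for illustration: two states, three metrics, Gaussian noise, metrics that are independent given the state, and hand-picked parameters. None of it comes from NAB.

```python
import numpy as np

def gaussian_loglik(value, mean, std):
    """Log-density of a normal distribution, elementwise."""
    return -0.5 * ((value - mean) / std) ** 2 - np.log(std * np.sqrt(2 * np.pi))

def forward_filter(observations, means, stds, transition, prior):
    """Return P(state_t | observations up to t) for every t.

    observations: (T, M) array, one column per metric
    means, stds:  (S, M) arrays, per state and per metric (assumed known)
    transition:   (S, S) array, transition[i, j] = P(next = j | current = i)
    prior:        (S,) initial state distribution
    """
    observations = np.asarray(observations, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    transition = np.asarray(transition, dtype=float)
    belief = np.asarray(prior, dtype=float)
    posteriors = np.zeros((observations.shape[0], len(belief)))
    for t, obs in enumerate(observations):
        if t > 0:
            belief = belief @ transition  # predict the next state
        # Metrics are assumed independent given the state, so their
        # log-likelihoods add up: this is where their info is combined.
        loglik = gaussian_loglik(obs[None, :], means, stds).sum(axis=1)
        weights = belief * np.exp(loglik - loglik.max())
        belief = weights / weights.sum()
        posteriors[t] = belief
    return posteriors

# Hypothetical system: state A (index 0) for 100 steps, then state B (index 1).
means = np.array([[0.0, 0.0, 0.0], [1.5, 1.5, 1.5]])
stds = np.ones((2, 3))
transition = np.array([[0.95, 0.05], [0.05, 0.95]])
prior = np.array([0.5, 0.5])

rng = np.random.default_rng(0)
states = np.array([0] * 100 + [1] * 100)
observations = means[states] + rng.normal(size=(200, 3))

posteriors = forward_filter(observations, means, stds, transition, prior)
accuracy = float((posteriors.argmax(axis=1) == states).mean())
print(accuracy)
```

In this toy setup each metric separates the two states by only 1.5 standard deviations, so any single one is a noisy indicator; summing the log-likelihoods across metrics, and carrying the belief forward in time, is what sharpens the estimate.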

sheiser1 · March 11, 2019, 11:18pm

Agreed for sure. Without something tying the metrics together, the codependencies found would be coincidental.

Bitking · March 11, 2019, 11:35pm

I spend considerable effort in my daytime job sorting out the 4 basic measurement error sources (linearity, span, zero/offset, non-repeatability) and the way these things can interact in electronic scales.

How hard can that be, really?

If you take the basic binary table of 2 inputs & one output as the possible ways that two errors might combine, then toss in variable scaling between them for each of the listed error terms, you are already deep into the realm of pseudo-random numbers.

Start to troubleshoot mechanical systems with many components bolted together and you move firmly from science to the dark arts. The flex of a support beam might turn out to be as much as the deflection of the load cell it is supposed to be supporting. The output from that sensor may be very complex; and it could well have some very messy relationship with other sensors on the same system.

Good luck with putting ear tags on your inputs and sorting them out as they work through your system.

rhyolight · March 17, 2019, 1:18am

I don’t think so.

Oblynx · March 17, 2019, 8:10am

@rhyolight Thanks. I am considering the data used in this paper instead: https://arxiv.org/abs/1703.07015
I found it here: https://paperswithcode.com/task/multivariate-time-series-forecasting
Their code and data are open on GitHub.

I’ll report back on how clean and useful these data are and what results I can get with them. Maybe add them to NAB afterwards?
