Hey @steinroe, I have my own Python module that builds multiple NuPIC anomaly detection models from multivariate time series.
It samples the first N rows to calibrate the encoders, then instantiates the models, applies them to the rest of the rows and saves them to disk. This yields a separate anomaly score for each field (CSV column) at each time step, which can then be combined, thresholded or otherwise analyzed.
I made it for my own research involving n-dimensional movements but generalized it with a config file: you give it the file paths, the first N rows to sample from, whether you want one multivariate model or multiple univariate ones, and a couple of other things.
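To give a rough idea of the flow, here is a simplified sketch. It is not the actual module: the `Config` field names are made up for illustration, `calibrate` only derives min/max ranges as a stand-in for encoder calibration, and a running z-score stands in for the NuPIC model. Only the multiple-univariate path is shown.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Config:
    # Hypothetical config fields, for illustration only.
    sample_rows: int  # first N rows used for calibration
    fields: list      # CSV columns to model, one model per column


def calibrate(rows, fields):
    """Per-field (min, max) from the sampled rows (stand-in for encoder calibration)."""
    return {f: (min(r[f] for r in rows), max(r[f] for r in rows)) for f in fields}


class ZScoreModel:
    """Stand-in for one univariate model. This is NOT an HTM model."""

    def __init__(self, lo, hi):
        self.lo = lo
        self.span = (hi - lo) or 1.0  # avoid dividing by zero on constant fields
        self.history = []

    def score(self, value):
        x = (value - self.lo) / self.span
        if len(self.history) < 2:
            s = 0.0  # not enough history to judge yet
        else:
            mu, sd = mean(self.history), pstdev(self.history)
            if sd == 0:
                s = 0.0 if x == mu else 1.0
            else:
                s = min(abs(x - mu) / (3 * sd), 1.0)  # clip to [0, 1]
        self.history.append(x)
        return s


def run(rows, cfg):
    """Calibrate on the first N rows, then score every remaining row per field."""
    sample, rest = rows[:cfg.sample_rows], rows[cfg.sample_rows:]
    ranges = calibrate(sample, cfg.fields)
    models = {f: ZScoreModel(*ranges[f]) for f in cfg.fields}
    return [{f: models[f].score(r[f]) for f in cfg.fields} for r in rest]
```

Each element of the returned list is a dict of field name to anomaly score for one time step, which is the shape the combining and thresholding steps would work from.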
Right now I don’t have it shared on GitHub, though if you’d like to test it out and would cite the repo I’d consider sharing it.
Also @pizzatakeaway (funny handle), I glanced at your work and noticed this:
“the algorithm was not able to tell which feature was responsible”
I think the multiple univariate model approach should help with this, since you get separate anomaly scores for each model at each time step.
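As a small illustration of that idea (a sketch only; the function name, the field names and the 0.9 threshold are arbitrary choices for the example), picking out the responsible fields at one time step could look like this:

```python
def responsible_fields(scores, threshold=0.9):
    """Return the fields whose anomaly score meets the threshold, highest score first.

    `scores` maps field name -> anomaly score for a single time step.
    The default threshold is an arbitrary illustrative value.
    """
    flagged = [(field, s) for field, s in scores.items() if s >= threshold]
    flagged.sort(key=lambda pair: pair[1], reverse=True)
    return [field for field, _ in flagged]


# Made-up scores for one time step, one entry per univariate model.
step_scores = {"x": 0.97, "y": 0.12, "z": 0.93}
print(responsible_fields(step_scores))  # ['x', 'z']
```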
Also, have you seen NAB (Numenta Anomaly Benchmark)? If not, I’d definitely recommend checking it out. Its purpose is to test HTM against other anomaly detectors, including Etsy Skyline and Twitter ADVec, across a range of data sets from numerous domains. I don’t think it includes RNN or LSTM as you do though. The code actually has a ‘create_new_detector’ script too, so you could potentially plug your RNN or LSTM detector into it. That’d allow you to quickly test those or any other detectors on 55+ data sets, which already have labeled anomalies and built-in detector-scoring functions.
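For a sense of what plugging in a detector might involve, here is a standalone sketch. It is written from memory of how NAB detectors are shaped (a `handleRecord` method that receives one record and returns the anomaly score), so treat the method name, the record keys and the return shape as assumptions and check them against the repo. `WrappedDetector` and `score_fn` are hypothetical names; inside NAB the class would presumably subclass the benchmark's detector base class rather than stand alone.

```python
class WrappedDetector:
    """Standalone sketch of a per-record detector wrapper (not NAB's actual base class)."""

    def __init__(self, score_fn):
        # Hypothetical: any callable mapping a raw value to an anomaly score,
        # e.g. a wrapper around an RNN or LSTM prediction error.
        self.score_fn = score_fn

    def handleRecord(self, inputData):
        # Assumed record shape: {"timestamp": ..., "value": ...}
        score = float(self.score_fn(inputData["value"]))
        # Clip to [0, 1] so scores are comparable across detectors.
        return (min(max(score, 0.0), 1.0),)


# Toy usage with a made-up scoring function.
detector = WrappedDetector(lambda v: v / 10.0)
print(detector.handleRecord({"timestamp": None, "value": 5}))  # (0.5,)
```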