Combining multiple metrics into a likelihood

Hey there,

I’m trying to implement anomaly detection along the lines of the discussion in supplementary section S4 of the “Unsupervised real-time anomaly detection for streaming data” paper (pages 6-8 of https://ars.els-cdn.com/content/image/1-s2.0-S0925231217309864-mmc1.pdf).

Let’s say I have two models, which output prediction errors s1_t and s2_t at every time step t. The goal of the discussion is to detect when the prediction error of the first model spikes at a different but nearby time to the second (e.g. s1_4 and s2_5 are both spikes). I’m very confused about why they propose including G, a Gaussian convolution kernel, which seems to take x, the input to HTM (the raw value), as its argument.

Could someone walk me through the math of this section, and if possible, how one would implement it?

Thanks!

Are you talking about the “anomaly likelihood” process? If so, this is not a part of HTM theory; it is an optimization added specifically for anomaly detection. There is no link to any biological process here.

In any case, I don’t know the maths well enough to walk you through it.

All good, and my bad! Misunderstood the categories.

I’m mostly curious whether anyone has successfully implemented the ideas discussed in the paper’s S4 section, or anything similar that runs multiple models in parallel and combines their output signals :smile:

I’ve only done a couple of small experiments using real data to satisfy myself it works and might help. Would be very curious to see if anyone has success (or not) with it!

Do you think it would be possible to share some code for those small experiments? I’m a bit lost in the process of translating the math in the paper to code :smile:

Hey @efriis,

So I had this same question about implementing the Anomaly Likelihood myself. First, I’d say that if you prioritize implementation over understanding the algorithm, you can just do:

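from nupic.algorithms import anomaly_likelihood

anomLiklObj = anomaly_likelihood.AnomalyLikelihood()
...
# Pseudocode: encode, spatialPooler, temporalMemory and calcAnom stand in for your own pipeline
for each raw_input to TM:
   ENC = encode( raw_input )                            # raw value -> encoded input
   TM_InputCols = spatialPooler( ENC )                  # active columns for this step
   TM_PredCols = temporalMemory( TM_InputCols )         # columns predicted for the next step
   A_Score = calcAnom( TM_PredCols_t-1, TM_InputCols )  # raw anomaly score: previous prediction vs. current input
   A_Likl = anomLiklObj.anomalyProbability( raw_input, A_Score )  # anomaly likelihood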
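
For anyone who does want to see what is going on underneath, the core of the likelihood calculation, as described in the main text of the paper, can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not NuPIC's implementation: the function names, the window sizes, and the `min_std` floor are made up for illustration, and NuPIC's `AnomalyLikelihood` class handles details (such as an initial learning period) that are left out here. The idea is to model the recent raw anomaly scores as a rolling normal distribution and ask how far a short-term average sits in its upper tail.

```python
import math
from collections import deque


def gaussian_tail(z):
    """Q-function: P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def anomaly_likelihoods(scores, long_window=100, short_window=5, min_std=1e-3):
    """Return one likelihood per raw anomaly score.

    long_window, short_window and min_std are illustrative assumptions,
    not values taken from the paper or from NuPIC.
    """
    history = deque(maxlen=long_window)   # scores used for the rolling mean / std
    recent = deque(maxlen=short_window)   # scores used for the short-term average
    likelihoods = []
    for s in scores:
        history.append(s)
        recent.append(s)
        mean = sum(history) / len(history)
        var = sum((x - mean) ** 2 for x in history) / len(history)
        std = max(math.sqrt(var), min_std)  # floor avoids dividing by ~0
        short_mean = sum(recent) / len(recent)
        # Likelihood = 1 - Q((short-term mean - long-term mean) / std)
        likelihoods.append(1.0 - gaussian_tail((short_mean - mean) / std))
    return likelihoods


# Toy usage: a flat stretch of low scores followed by a burst of high ones.
toy_scores = [0.1] * 200 + [1.0] * 5
toy_likelihoods = anomaly_likelihoods(toy_scores)
```

On the flat stretch the short-term and long-term means agree, so the likelihood sits near 0.5; during the burst it moves close to 1.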

Unfortunately I don’t have any code for the multiple metrics equations in S4. The small experiments were done in Excel using proprietary data that I don’t think I am allowed to share. Here’s a small version with random data. If you know Excel you can see how the computations were done.

The file assumes you already have the likelihood. You can use @sheiser1’s code to see how likelihood itself was computed.

This was done a couple of years ago, I did not spend much time on this, and it is not cleanly done. No guarantees - you’re mostly on your own here!
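
For readers without Excel, the temporal-smoothing idea behind the original question can be illustrated in Python. This is only a sketch of the general idea and should not be read as the equations in S4: it assumes the Gaussian kernel is indexed by time lag and is applied to each model's error series, which is one possible reading of G, and it uses a simple pointwise product as an AND-like overlap measure. The names `gaussian_kernel`, `smooth`, `sigma`, and `radius` are hypothetical. The point is just that two spikes at t=4 and t=5 do not overlap in the raw series but do overlap once each series is smoothed.

```python
import math


def gaussian_kernel(sigma=1.0, radius=3):
    """Gaussian weights over time lags -radius..radius, normalised to sum to 1."""
    weights = [math.exp(-(lag ** 2) / (2.0 * sigma ** 2))
               for lag in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(series, kernel):
    """Convolve a series with a symmetric kernel; values outside the series count as 0.

    The kernel is centred, so it looks at future samples too. A streaming
    version would have to restrict itself to past lags.
    """
    radius = len(kernel) // 2
    out = []
    for t in range(len(series)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = t + k - radius
            if 0 <= j < len(series):
                acc += w * series[j]
        out.append(acc)
    return out


# Toy prediction errors: model 1 spikes at t=4, model 2 spikes at t=5.
s1 = [0.0] * 10
s1[4] = 1.0
s2 = [0.0] * 10
s2[5] = 1.0

kernel = gaussian_kernel(sigma=1.0, radius=3)
g1 = smooth(s1, kernel)
g2 = smooth(s2, kernel)

# Pointwise product as a crude "both models are unhappy" signal.
raw_overlap = [a * b for a, b in zip(s1, s2)]        # zero everywhere
smoothed_overlap = [a * b for a, b in zip(g1, g2)]   # non-zero around t=4 and t=5
```

A larger `sigma` tolerates a larger offset between the two spikes, at the cost of blurring when exactly the anomaly happened.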

@efriis Did you achieve anything here? I am also struggling with the implementation of what is described in S4…