Anomaly detection for multi features

MAK · March 3, 2019, 3:29pm

Hello,
According to the examples I found on the site, only one attribute is ever set as the predicted field when detecting anomalies:

> model.enableInference({'predictedField': 'consumption'})

I have three questions:

If I want to predict more than one attribute, can I do that with NuPIC (for anomaly detection purposes)?
What is the recommended number of features that I can predict, as a ratio of the number of input features? Can I predict all the input features, or would that decrease the accuracy of the prediction?
If the answer to the first question is yes, how do I tell the model to predict more than one feature with model.enableInference({'predictedField': 'consumption'})?
Thanks.

marty1885 · March 4, 2019, 5:21am

I'm not familiar with the full NuPIC regions API, so this is just from my experience with HTM overall.

Yes, just use the MultiEncoder to encode all your data into one SDR.

Around 5 to 6 features is generally the maximum you can work with. And sort of yes: the more features, the less accurate the predictions become (more possible correlations).

A MultiEncoder!

github.com/numenta/nupic-legacy/blob/50c5fd0dc94f2ffb205544ed11fe82ad5bb0de18/docs/examples/network/example-create-encoder.py

from nupic.encoders import MultiEncoder

def createEncoder(encoderParams):
  encoder = MultiEncoder()
  encoder.addMultipleEncoders(encoderParams)
  return encoder

# Use the same modelParams extracted from the YAML file earlier.
encoderParams = modelParams["sensorParams"]["encoders"]

# Add encoder to the sensor region.
sensorRegion = network.regions["sensor"].getSelf()
sensorRegion.encoder = createEncoder(encoderParams)
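To make the idea of "encoding all your data into one SDR" concrete, here is a minimal sketch in plain Python. It does not use NuPIC; the toy `encode_scalar` function, the `temperature` field, and all ranges and bit widths are assumptions chosen for illustration. It only shows the general principle of concatenating per-field encodings in a fixed order, not the actual MultiEncoder implementation.

```python
def encode_scalar(value, min_val, max_val, n=20, w=5):
    """Toy scalar encoder (hypothetical): a run of w active bits out of n,
    whose start position tracks where value falls in [min_val, max_val]."""
    value = max(min_val, min(max_val, value))  # clip into the valid range
    n_buckets = n - w + 1
    fraction = (value - min_val) / float(max_val - min_val)
    bucket = int(round(fraction * (n_buckets - 1)))
    return [1 if bucket <= i < bucket + w else 0 for i in range(n)]


def encode_record(record, field_specs):
    """Concatenate the per-field encodings in a fixed field order."""
    bits = []
    for name, (min_val, max_val) in field_specs:
        bits.extend(encode_scalar(record[name], min_val, max_val))
    return bits


# Hypothetical fields and ranges, for illustration only.
field_specs = [("consumption", (0.0, 100.0)), ("temperature", (-10.0, 40.0))]
sdr = encode_record({"consumption": 50.0, "temperature": 40.0}, field_specs)
print("width=%d active=%d" % (len(sdr), sum(sdr)))  # width=40 active=10
```

Each field contributes its own block of bits, so the combined width grows with every field you add.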

MAK · March 4, 2019, 7:05am

Ok, thanks for your answers.
So does that mean that for prediction I need to define the following?

model.enableInference({'predictedField': 'my_multiEncoderField'})

Another question: you wrote that the maximum number of predicted features is 5 to 6.
Does your recommendation depend on the number of bits (n) representing those features in the SP? For example, if I have 10 parameters that share the 2048-bit span, and I try to predict some or all of them, why does this change the accuracy of the prediction?
In addition, with other multi-feature prediction techniques (like LSTM or CNN), if I have a lot of features that I want to predict and a complex structure, I add more layers to my model. Is there something similar in NuPIC that gives the ability to represent a more complex structure?
Thanks

marty1885 · March 4, 2019, 7:31am

Thanks for your interest in HTM and asking on the forum.

I’m not very familiar with the regions API, but I do think that is the way to go. Maybe another forum member can help.

You predict all of them. A standard HTM network is composed of two layers: an SP (Spatial Pooler), which reduces noise and unneeded information, and a TM (Temporal Memory), which predicts the next time step. Given a predicted SDR, you can ask the SP to walk back its synapses to reconstruct the input field.

Unfortunately, although stacking layers in NuPIC is doable, it will not give you any extra performance in standard HTM. I (and other community members) am working on an implementation of the Thousand Brains Theory, which will allow multiple cortical columns to work together. But unlike stacking CNN layers to extract higher- and higher-level features, it uses multiple cortical columns predicting simultaneously.

rhyolight · March 4, 2019, 5:14pm

Hi @MAK thanks for joining the forum.

Unfortunately not.

Moti · March 4, 2019, 8:27pm

Thanks…

MAK · March 5, 2019, 3:47pm

Hi @rhyolight, thanks for your answers.
But if I am working with the algorithm API as @marty1885 recommended, with a multi-encoder for encoding multiple features, why can’t I use the anomaly detection mechanism directly? Or even implement it according to the document http://nupic.docs.numenta.org/1.0.5.dev0/guides/anomaly-detection.html#temporalanomaly-model

I am trying to understand the limitations of the existing HTM framework.
Can I predict or recognize anomalies when I have more than one feature (with the Algorithm API)? And if I want to predict more than one attribute, can I do that by putting all the features I want to predict into one multi-encoder and setting that as the predicted field?

Thanks

marty1885 · March 5, 2019, 4:44pm

Hi
Since you asked "If I want to predict more than one attribute": I think rhyolight (aka Matt) is referring to reconstructing all bits of the input field and converting them back into real values, which is in fact nearly impossible. But sure, you can use the anomaly detector with multiple input features. That is doable and a quite useful feature of HTM.

rhyolight · March 5, 2019, 4:45pm

Yes, you are able to implement these things yourself. Please do, and share with us when you are done!

Yes you can. The anomaly indication takes into account all encoded features. You can do this through the Network API or use the OPF with a TemporalAnomaly model as shown in the hot gym anomaly params. You can add more fields and encodings to this model and as long as you adjust your model params in the SP appropriately (the number of encoded bits must match the SP’s input size) it should work.
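The constraint mentioned above (the total number of encoded bits must match the SP's input size) can be checked before building a model. The following is only a sketch under assumptions: it uses a simplified, hypothetical params layout in which every encoder states its width as `n` and the SP states `inputWidth`. Real NuPIC model params can look different (for example, some encoders derive their width from other settings), and the `temperature` field and all numbers are made up.

```python
def total_encoder_width(encoders):
    """Sum the output width n of every enabled (non-None) encoder."""
    return sum(spec["n"] for spec in encoders.values() if spec is not None)


def check_sp_input_width(encoders, sp_params):
    """Raise if the combined encoder output does not fit the SP input."""
    width = total_encoder_width(encoders)
    expected = sp_params["inputWidth"]
    if width != expected:
        raise ValueError(
            "encoders produce %d bits but the SP expects %d" % (width, expected))
    return width


# Hypothetical, simplified params for illustration only.
encoders = {
    "consumption": {"n": 50, "w": 21},
    "temperature": {"n": 50, "w": 21},
    "unused_field": None,  # a disabled encoder contributes no bits
}
sp_params = {"inputWidth": 100}

width = check_sp_input_width(encoders, sp_params)
print("total encoder width: %d" % width)  # total encoder width: 100
```

Adding a third field with `n = 50` would raise the total to 150, so the SP input size in this sketch would have to be raised to match.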

rhyolight · March 5, 2019, 4:49pm

It is possible, and there is even an issue describing it. But decoders are required for this, so it would only work with simple scalar numbers today.

1113 · May 12, 2019, 10:50am

Hi All,

Thank you all for the valuable info you provide here :).
I’m a little bit confused:

I want to build an anomaly detection app and I have multiple features. They are scalar, but I only need an anomaly score for them. From what I understood, this is possible, and I have to use ‘TemporalAnomaly’ as the inference type. Am I right? If so, when creating the model I need to call the model.enableInference function, but it receives only one field… :(. What is the solution here?
Do I have to create the model in some other way, or use a new encoder for my multiple features?

thanks in advance

arkady

sheiser1 · May 13, 2019, 3:59am

I’m doing multivariate anomaly detection too, and the approach I take is to have parallel NuPIC models running, one for each feature (often referred to as ‘fields’ in NuPIC lingo).

I generate the model params file for each field’s model with:

from nupic.frameworks.opf.common_models.cluster_params import (
    getScalarMetricWithTimeOfDayAnomalyParams)

MODEL_PARAMS = getScalarMetricWithTimeOfDayAnomalyParams(
    metricData = my_dataframe[[my_fieldname,my_datetimefieldname]].values,
    tmImplementation = "cpp",  ## use 'tm_cpp' if you don't want BacktrackingTM
    minVal = min(my_dataframe[my_fieldname]),
    maxVal = max(my_dataframe[my_fieldname]),
)

To see the encoder params you can do this:

encodersDict = MODEL_PARAMS["modelConfig"]["modelParams"]["sensorParams"]["encoders"]
for field, paramVals in encodersDict.items():
    print('Field = {} --> ParamVals = {}'.format(field, paramVals))

Once you fully understand this sub-dict I’m calling ‘encodersDict’, you can modify it to use your field-name, rather than the default ‘c1’. You could also add more fields to make it a multivariate model, though I’d recommend using separate univariate models for a couple reasons:

It’ll give each model as clear a signal as possible
You can see which specific fields are showing anomalous behavior, which you can’t do if you put them all into 1 big model. This also gives you the flexibility to define a ‘system anomaly’ however you like. I currently say if more than __% of fields are anomalous within __ timespan then the multivariate system as a whole is being anomalous.
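One way to express such a 'system anomaly' rule on top of separate per-field models is sketched below in plain Python. This is an illustration only: the class name, the field names, the 0.9 score threshold, the 50% fraction, and the 3-step window are all placeholder assumptions, not values from this post. It assumes each per-field model already produces an anomaly score between 0 and 1 at every time step.

```python
from collections import deque


class SystemAnomalyDetector(object):
    """Flags the whole system when enough fields were recently anomalous."""

    def __init__(self, fields, score_threshold=0.9, min_fraction=0.5, window=3):
        self.score_threshold = score_threshold  # per-field anomaly cutoff
        self.min_fraction = min_fraction        # share of fields required
        # Per field: whether it was anomalous in each of the last `window` steps.
        self.recent = dict((f, deque(maxlen=window)) for f in fields)

    def update(self, scores):
        """scores maps field name -> anomaly score for the current step.
        Returns (system_is_anomalous, sorted list of flagged fields)."""
        for field, history in self.recent.items():
            history.append(scores[field] >= self.score_threshold)
        flagged = sorted(f for f, h in self.recent.items() if any(h))
        fraction = len(flagged) / float(len(self.recent))
        return fraction > self.min_fraction, flagged


# Hypothetical fields and scores, for illustration only.
detector = SystemAnomalyDetector(["cpu", "memory", "disk"])
step1 = detector.update({"cpu": 0.95, "memory": 0.10, "disk": 0.20})
step2 = detector.update({"cpu": 0.10, "memory": 0.97, "disk": 0.30})
print(step1)  # (False, ['cpu'])
print(step2)  # (True, ['cpu', 'memory'])
```

Because each field keeps its own history, the second return value still tells you which fields triggered the system-level alert.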

Also to address your original question about model.enableInference only taking one field, this is because currently even multivariate NuPIC models can only forecast one field at a time. Just because you’re only forecasting one field doesn’t mean the model is only learning from that field.

breznak · May 13, 2019, 2:44pm

Hi, just to clarify, I think people are mixing things up here.

For anomaly detection: you don’t need the decodings of the predicted patterns. Hence you can simply encode multiple features into a single HTM region (using MultiEncoder, as Marty suggested). HTM will create its representations and predictions by itself, from which you derive the anomaly score.
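As a rough illustration of why no decoding is needed, the raw anomaly score is commonly described as the fraction of currently active columns that were not predicted at the previous step. The sketch below is plain Python, not NuPIC code; the function name and the column indices are assumptions for illustration.

```python
def raw_anomaly_score(active_columns, predicted_columns):
    """Fraction of active columns that were NOT predicted.
    0.0 means fully predicted, 1.0 means completely unexpected."""
    active = set(active_columns)
    if not active:
        return 0.0  # nothing active, so nothing was unexpected
    unpredicted = active - set(predicted_columns)
    return len(unpredicted) / float(len(active))


# Made-up column indices, for illustration only.
print(raw_anomaly_score([1, 2, 3, 4], [3, 4, 5]))     # 0.5
print(raw_anomaly_score([1, 2, 3, 4], [1, 2, 3, 4]))  # 0.0
print(raw_anomaly_score([1, 2, 3, 4], []))            # 1.0
```

The score only compares column indices, so it does not matter how many fields were encoded into the input or which one is set as the predicted field.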

For “decoding”, i.e. predicting the values back in “human”/original format, you need a so-called Classifier. It would be possible to predict from multiple features at once, but as Matt says, the implementation probably does not support that.

(@dmac, does our reworked SDRClassifier work on multiple features?)

sheiser1 · May 13, 2019, 2:58pm

Right, the anomaly scores are not affected by which feature is set as ‘predictedField’, though I think you still need to set a predictedField for the params.

dmac · May 13, 2019, 5:19pm

Use multiple classifiers.

1113 · May 15, 2019, 5:35am

thank you very much for your answer!
