Anomaly Detection - Hot Gym Tutorial : How to consider several variables?


#1

Hi!

I want to do anomaly detection on a time series dataset that has multiple variables.

As some of those variables are categorical, I am not using HTM Studio nor the NAB, but I am following the Hot Gym tutorial for anomaly detection.

I took the code from nupic/examples/opf/clients/hotgym. It appears that only two fields can be given: the timestamp and one other variable, the “predictedField” (in the case of the Hot Gym it is kw_energy_consumption).

I modified the code so that the timestamp becomes an option (as I don’t always want to consider it), and now I’d like to enable the program to take several variables at once (not only one).

Is it possible to do that “easily”?

I hope somebody will be able to help!


#2

I’m not a NuPIC guy, but I’ve been playing around with HTM.Java, the Java version of NuPIC, and it is simple to do what you want. You have to set up your sensor encoders to read each variable you want to take into account, and then add all the predicted fields you want to the classifier.

For example, say you have a multivariate time series with the following fields: TS (a timestamp), A (a number), B (a string-labeled category) and C (another number), and you want to predict fields A and B.

You set up your fields like this:
field TS, fieldName=“TS”, encoderType=DateEncoder, dateFieldPattern=“MM/dd/YYYY HH:mm:ss”, and timeofday with w=21
field A, fieldName=“A”, encoderType=ScalarEncoder or RandomDistributedScalarEncoder or DeltaEncoder, minVal=minimal value for A field, maxVal=maximal value for A field, n=126 and w=21
field B, fieldName=“B”, encoderType=(CategoryEncoder or SDRCategoryEncoder), categoryList=[each possible category] and w=21
field C, fieldName=“C”, encoderType=ScalarEncoder or RandomDistributedScalarEncoder or DeltaEncoder, minVal=minimal value for C field, maxVal=maximal value for C field, n=126 and w=21

Set up your file sensor header like this:
“TS,A,B,C”
“datetime, float, string, float”
“T,”

Then you add your predicted fields to the classifier. In HTM.Java it is done like this:
p.set(KEY.INFERRED_FIELDS, getInferredFieldsMap(SDRClassifier.class));
Where each inferred field is mapped to its respective classifier.
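In Python terms (the language of the Hot Gym example), the idea of "one encoder per field, all feeding one input" can be sketched without any HTM library. The encoders below are toy stand-ins written for illustration, not the NuPIC or HTM.Java implementations, and the value ranges and category names are made up:

```python
# Toy illustration only: these are NOT the NuPIC / HTM.Java encoders.
# Ranges (0..100, -10..10) and categories are hypothetical placeholders.

def encode_scalar(value, minval, maxval, n=126, w=21):
    """Return n bits with one contiguous run of w ones positioned by value."""
    value = min(max(value, minval), maxval)   # clip to the configured range
    positions = n - w + 1                     # possible start offsets of the run
    start = int(round((value - minval) / float(maxval - minval) * (positions - 1)))
    return [1 if start <= i < start + w else 0 for i in range(n)]

def encode_category(value, categories, w=21):
    """Return one block of w ones per category; block 0 means 'unknown'."""
    index = categories.index(value) + 1 if value in categories else 0
    n = (len(categories) + 1) * w
    return [1 if index * w <= i < (index + 1) * w else 0 for i in range(n)]

def encode_record(record):
    """Concatenate the per-field encodings into a single input vector."""
    return (encode_scalar(record["A"], 0.0, 100.0)
            + encode_category(record["B"], ["red", "green", "blue"])
            + encode_scalar(record["C"], -10.0, 10.0))

bits = encode_record({"A": 42.0, "B": "green", "C": 3.5})
print(len(bits), sum(bits))  # 336 63
```

Each field contributes its own slice of the vector (126 + 84 + 126 bits here) with 21 active bits, so no variable overwrites another.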

I hope this helps


#3

Thank you for your answer, I will look at it!

As for the Python Hot Gym, I was thinking of using the MultiEncoder that could concatenate (if I’m not wrong…) the variables.
I specified it in model_params.py. For the MultiEncoder input ‘encoderDefinitions’ I passed a dict of dicts containing the variables I wanted to concatenate and the encoder to use for each variable (with its parameters).
But I didn’t manage to make it work.
If somebody has an idea…
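For reference, a minimal sketch of the dict-of-dicts layout described above, modelled on the `encoders` section of the Hot Gym model_params.py. The field names ("A", "B", "C"), ranges, and categories are hypothetical, the helper functions are mine, and the key names are written from memory of the Hot Gym parameters, so they should be checked against the installed NuPIC version. As far as I understand, the OPF already combines every entry of this dict into one MultiEncoder, so it may not be necessary to declare a MultiEncoder explicitly.

```python
# Sketch of an `encoders` dict for model_params.py (assumed to live under
# MODEL_PARAMS['modelParams']['sensorParams']['encoders'] as in Hot Gym).
# Field names, ranges and categories below are hypothetical placeholders.

def scalar_field(name, minval, maxval, n=126, w=21):
    """Entry for a numeric field, mirroring the Hot Gym ScalarEncoder entry."""
    return {
        "fieldname": name,
        "name": name,
        "type": "ScalarEncoder",
        "minval": minval,
        "maxval": maxval,
        "clipInput": True,
        "n": n,
        "w": w,
    }

def category_field(name, categories, w=21):
    """Entry for a categorical field (assumes CategoryEncoder's categoryList/w)."""
    return {
        "fieldname": name,
        "name": name,
        "type": "CategoryEncoder",
        "categoryList": list(categories),
        "w": w,
    }

def build_encoders(use_timestamp=True):
    """Build the encoders dict; a None entry disables that encoder."""
    time_of_day = {
        "fieldname": "timestamp",
        "name": "timestamp_timeOfDay",
        "type": "DateEncoder",
        "timeOfDay": (21, 9.5),
    }
    return {
        "timestamp_timeOfDay": time_of_day if use_timestamp else None,
        "A": scalar_field("A", 0.0, 100.0),
        "B": category_field("B", ["red", "green", "blue"]),
        "C": scalar_field("C", -10.0, 10.0),
    }

ENCODERS = build_encoders(use_timestamp=False)
```

Every key in the dict becomes one input variable. The field to score would presumably still be named as the `predictedField` when enabling inference, as the Hot Gym client does for kw_energy_consumption.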


#4

Hi @chacd. Yes, you can use the MultiEncoder to easily combine encodings. You could also run one model per variable. This example usage might help you:


#5

Thanks!

I also read another post and it helped: