Don't swarm for Anomaly models

Continuing the discussion from Bad Anomaly detection for complex periods data:

This keeps coming up in different ways, so I just want to clarify. If you want to get temporal anomalies from streams of scalar data, you do not need to run a swarm. We already have model parameters you can use that are generally good at detecting scalar anomalies. You can get them by calling this function with the right parameters:
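For example, a minimal sketch of calling it and building a model from the result (the min/max values here are just placeholders for your own metric's range, not recommendations):

    from nupic.frameworks.opf.model_factory import ModelFactory
    from nupic.frameworks.opf.common_models.cluster_params import (
        getScalarMetricWithTimeOfDayAnomalyParams)

    # Ask for the canned anomaly-detection params. minVal/maxVal should be
    # set to the expected range of your own metric.
    params = getScalarMetricWithTimeOfDayAnomalyParams(
        metricData=[0],            # only needs to be a non-empty sequence
        tmImplementation="cpp",
        minVal=0.0,
        maxVal=100.0)

    model = ModelFactory.create(modelConfig=params["modelConfig"])
    model.enableInference(params["inferenceArgs"])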

Here is an example case of them being used for NAB (thanks @alavin):

There you go! No swarming, no fuss digging around for previously used parameters.


Hi!
The model which is returned from getScalarMetricWithTimeOfDayAnomalyParams() is nice but has
'timestamp_dayOfWeek' = None and
'timestamp_weekend' = None.
I want NuPIC to be aware of weekdays and weekends. What values should I use?
…
E.g. here I found an example with
"dayOfWeek": [21, 5.01],
Where do the numbers 21 and 5.01 come from?

Thanks
Christian

Those values are "width" and "radius":

You can see exactly what is happening with those values in the date encoder:

As you can see above, the width and radius are used directly to create a periodic scalar encoder. See also:
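For concreteness, here is a minimal sketch (mine, not from the linked post) of handing that (width, radius) pair to the DateEncoder, which builds a periodic scalar encoder over the 7 days of the week internally:

    import datetime
    from nupic.encoders.date import DateEncoder

    # width=21 bits, radius=5.01 days, as in the example params quoted above
    encoder = DateEncoder(dayOfWeek=(21, 5.01))
    bits = encoder.encode(datetime.datetime(2017, 7, 8, 9, 8))
    print(bits.nonzero()[0])   # positions of the 21 active bits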

Thanks Matt for linking those useful pieces of info! Also the …encoders/scalar.py had some good pedagogical comments in it. /Christian


Hi Taylor,
I am a newbie to HTM, so there are some things that confuse me:

  1. what's the difference between 'tmImplementation = "cpp"' and 'tmImplementation = "cpp"'? which is better for anomaly detection? my data is minutely~ thanks.

  2. the function you mentioned, 'getScalarMetricWithTimeOfDayAnomalyParams': can it be used for minutely data?

Recently, I have become very interested in HTM, especially in AD (anomaly detection). However, because of the complexity of the algorithm and its numerous parameters, I have made very little progress.
Yesterday, I tried swarming. Ironically, today, I found this topic… :sweat_smile:

  1. What if I want to encode the timestamp with a granularity of one minute?

eh… I have been writing this message over several sittings today :joy:

  1. just a moment ago, I ran the example code in the nupic.workshop project. However, there are some errors, such as ModelFactory received unsupported Model type: CLA and No module named modelfactory. From the type of the errors, I wonder whether the code in the workshop is out of date?
    I noticed the last update was a year ago; compared with other HTM resources or projects, that could be seen as fairly recent. So… why?

  2. I looked at the details in the model params and found that no matter how fine the time granularity (daily, hourly or minutely), the 'timestamp_timeOfDay' dict is the same… does that mean the time encoding has little or no influence on AD or prediction?

  3. NuPIC/HTM is not that kind to a newbie, especially one in China :joy:. Actually, all I want is an up-to-date, detailed procedure for handling data (time: minutely, like 2017-07-08 09:08:00; value: scalar) for anomaly detection.

  4. and there are further questions:

  • what if I want to turn off 'learning' when I think the model is trained well enough to meet my needs?
  • What if I want to continue training the model from where I stopped 'learning'?
  • what if I want to manually label an anomaly that HTM does not see as anomalous?
  1. In the docs for AD, there is a URL: https://github.com/subutai/nupic.subutai/tree/master/swarm_examples
    with the sentence: 'swarming with anomaly detection models'. So I don't know… I… I am confused…

  2. eh… I used the function getScalarMetricWithTimeOfDayAnomalyParams
    and changed the data header to ('c0','c1'),
    but there is still this error:
    ValueError: Unknown field name 'c0' in input record. Known fields are 'c1'. This could be because input headers are mislabeled, or because input data rows do not contain a value for 'c0'. I use nyc_taxi.csv for the first test…
    I suspect this results from the type of 'c0' (datetime expected), so I changed it to a datetime type; however, the error changed to this:
    ValueError: Unknown field name 'c1' in input record. Known fields are 'c0'. This could be because input headers are mislabeled, or because input data rows do not contain a value for 'c1'.

    > #!/usr/bin/env python2
    > # -*- coding: utf-8 -*-
    > """
    > Created on Fri Jul 21 11:17:35 2017
    >
    > @author: pegasus
    > """
    >
    > from nupic.frameworks.opf.model_factory import ModelFactory
    > from nupic.frameworks.opf.common_models.cluster_params import (
    >     getScalarMetricWithTimeOfDayAnomalyParams)
    > import nupic_output
    > import shutil
    > import csv
    > import numpy as np
    > import time, datetime
    >
    >
    > def run_experiment(PLOT=True):
    >
    >     params = getScalarMetricWithTimeOfDayAnomalyParams(
    >         metricData=[0],
    >         tmImplementation="cpp",
    >         minVal=8,
    >         maxVal=39197)
    >
    >     model = ModelFactory.create(modelConfig=params["modelConfig"])
    >     model.enableLearning()
    >     model.enableInference(params["inferenceArgs"])
    >
    >     if PLOT:
    >         output = nupic_output.NuPICPlotOutput("nyc_output", show_anomaly_score=True)
    >     else:
    >         output = nupic_output.NuPICFileOutput("nyc_output", show_anomaly_score=True)
    >     # output = nupic_output.NuPICFileOutput('sine_out')
    >     with open('nyc_taxi.csv', 'rb') as sine_input:
    >         csv_reader = csv.reader(sine_input)
    >
    >         # skip headers
    >         csv_reader.next()
    >         csv_reader.next()
    >         csv_reader.next()
    >
    >         # data
    >         for row in csv_reader:
    >             a = time.strptime(row[0], '%Y-%m-%d %H:%M:%S')
    >             angle = datetime.datetime(*a[:3])
    >             sine_value = int(row[1])
    >             date = model.run({'c0': angle})
    >             result = model.run({'c1': sine_value})
    >             print(result.inferences)
    >             output.write(angle, date, sine_value, result, prediction_step=1)
    >     output.close()
    >
    > if __name__ == '__main__':
    >     run_experiment(PLOT=False)

Sorry for asking so many questions. I like the ideas inside HTM, so I have worked on this model for about a month: from reading papers, watching videos and running demos, to now running the examples on my own data. But there are so many problems I couldn't solve all by myself.

I think you mean py vs cpp. We have versions of the HTM algorithms in both Python and C++. This specifies which version you want to use. C++ is faster. They both return exactly the same results.

Yes.

We recently introduced breaking changes to NuPIC. Those are the errors you're seeing. Thanks for pointing this out, we'll be fixing this example codebase soon.

If you have daily data points, there should be no time-of-day encoding included. The params returned by getScalarMetricWithTimeOfDayAnomalyParams won't handle daily inputs well. Hourly or minutely, sure. Best would be every 10-15 minutes.

That is a good idea if you want the model to lock into recently seen patterns and not learn anything new.

You just re-enable learning.

Youā€™ll have to do this outside of NuPIC.

You'll have to explain what you are confused about. Swarming returns model params that could be used for prediction or anomaly detection. So the statement makes sense to me.

You will either need to change these labels to match your input file's field names, or change the input file's field names to match the model params.

Hi Taylor, it is the best thing of the day to receive your response early in the morning (in China). Thanks, you have really helped me a lot.
as for the question:

what's the difference between 'tmImplementation = "cpp"' and 'tmImplementation = "cpp"'? which is better for anomaly detection? my data is minutely~ thanks.

I have to say sorry, I made a mistake: it should be the difference between 'cpp' and 'tm_cpp'.

as for:

what if I want to turn off 'learning' when I think the model is trained well enough to meet my needs?

What if I want to continue training the model from where I stopped 'learning'?

What I really want to express is: how do I turn learning on/off while the model is running in a real-time environment? i.e., what should I do when I want to continue previous training from the point where I paused the model, or how can I save the parameters obtained from training and then train again using the last saved parameters as the initial state?

what if I want to manually label an anomaly that HTM does not see as anomalous?

I meant whether a user (like me) could serve as an assistant to the HTM model, i.e., when a type of anomaly is not detected as anomalous, I 'tell' the HTM model that these are anomalies, so HTM can recognize them the next time they occur. Or when a batch of normal data is mistaken for anomalies by HTM, I 'tell' HTM that these are normal, so that when this type of data appears again HTM regards it as normal. Could this be possible, and how…

In the docs for AD, there is a URL: https://github.com/subutai/nupic.subutai/tree/master/swarm_examples

with the sentence: 'swarming with anomaly detection models'. So I don't know… I… I am confused…

> You'll have to explain what you are confused about. Swarming returns model params that could be used for prediction or anomaly detection. So the statement makes sense to me.

What confuses me is that the docs suggest we use swarming, but here in this topic ('don't swarm for Anomaly models') we are told not to use swarming. So I wonder which I should follow?

You will either need to change these labels to match your input file's field names, or change the input file's field names to match the model params.

Yes, the default field names in the model params are 'c0' and 'c1', so I changed my input file's fields to these two names. Then it raised those errors.

Thank you again :smile:. You really do a good job :+1:. It is you (together with the papers) who really introduced me to HTM, which attracts me very much at the moment.

You probably want cpp: http://nupic.docs.numenta.org/1.0.0/quick-start/example-model-params.html

See serialization, and note that the algorithms all have functions to enable and disable learning (see the API docs).
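A rough sketch (mine, not from the docs linked above) of pausing/resuming learning and checkpointing, assuming the NuPIC 1.x OPF API:

    from nupic.frameworks.opf.model_factory import ModelFactory
    from nupic.frameworks.opf.common_models.cluster_params import (
        getScalarMetricWithTimeOfDayAnomalyParams)

    # Placeholder min/max; use your own metric's range.
    params = getScalarMetricWithTimeOfDayAnomalyParams(
        metricData=[0], tmImplementation="cpp", minVal=0.0, maxVal=100.0)
    model = ModelFactory.create(modelConfig=params["modelConfig"])
    model.enableInference(params["inferenceArgs"])

    model.disableLearning()   # keep inferring, stop adapting
    # ... run the model on live data without learning ...
    model.enableLearning()    # pick learning back up where it left off

    # Save the trained state to disk, and later restore it as the initial
    # state for further training:
    model.save("/tmp/my_model_checkpoint")
    restored = ModelFactory.loadFromCheckpoint("/tmp/my_model_checkpoint")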

There is no way to do this with NuPIC currently.

To run HTM, you must create instances of HTM algorithms (spatial pooling, temporal memory). These instances must be configured (they are highly configurable). There are millions of permutations of potential configurations for a given data set to try. Swarming attempts to find a set of decent model parameters (including data encoding details) instead of making the user manually try lots of different settings.

We have found a set of parameters for the SP and TM that work well for all single-field scalar data, so you donā€™t have to swarm if you want to do anomaly detection on single-field scalar streams.

I don't know what is going on here. This has always worked for me in the past. Those errors don't seem to make sense if you've changed the headers. Can you double-check the model parameters and your input file? Make sure the encoder names in the model params match your input file header. If you are manually processing the input file, you will also need to use those same field names in your code where you build each input record.
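For what it's worth, here is a sketch of the kind of run loop the OPF examples use, with both fields passed together in a single run() call ('c0'/'c1' are the default field names that getScalarMetricWithTimeOfDayAnomalyParams configures; the nyc_taxi.csv header handling below is an assumption, so adjust it to your file):

    import csv
    import datetime

    from nupic.frameworks.opf.model_factory import ModelFactory
    from nupic.frameworks.opf.common_models.cluster_params import (
        getScalarMetricWithTimeOfDayAnomalyParams)

    # Placeholder min/max for the nyc_taxi value range.
    params = getScalarMetricWithTimeOfDayAnomalyParams(
        metricData=[0], tmImplementation="cpp", minVal=8, maxVal=39197)
    model = ModelFactory.create(modelConfig=params["modelConfig"])
    model.enableInference(params["inferenceArgs"])

    with open("nyc_taxi.csv", "rb") as f:
        reader = csv.reader(f)
        reader.next()  # skip header row(s); adjust to match your file
        for timestampStr, valueStr in reader:
            # Both fields go into ONE record, keyed by the same field
            # names the model params were built with.
            result = model.run({
                "c0": datetime.datetime.strptime(timestampStr, "%Y-%m-%d %H:%M:%S"),
                "c1": float(valueStr),
            })
            print(result.inferences["anomalyScore"])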

:smiley: Thank you very much for saying so. I'm happy I can help you. This can be hard to get running, but you can do some interesting things with it once you have worked with it for a while.

Hi Taylor, thanks for patiently answering my questions. I have learned a lot from your responses. OK, I will keep trying. Thanks again. :blush:

Hi Matt

Thank you for such a detailed explanation of the model. The model works well, but just out of curiosity, what should I adjust to get result.inferences["multiStepBestPredictions"][1]?

Thanks a lot!

Have a good day!


Each of these dictionaries should have a key corresponding to the steps ahead for each prediction. In this example, we are retrieving predictions for both 1 and 5 steps ahead (which was defined in the Model Parameters).

From: http://nupic.docs.numenta.org/stable/quick-start/opf.html#model-parameters

Specifically, in the classifier parameters, you can change the number of steps:

  clParams:
    verbosity: 0
    regionName: SDRClassifierRegion
    alpha: 0.1
    steps: '1,5'
    maxCategoryCount: 1000
    implementation: cpp
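With steps: '1,5' configured, each run result carries a best prediction for both horizons. A minimal sketch of reading them out (assuming `model` is an OPF model configured with these clParams and `record` is one input record dict):

    result = model.run(record)

    # multiStepBestPredictions is keyed by the number of steps ahead,
    # matching the 'steps' setting in clParams.
    one_step_ahead = result.inferences["multiStepBestPredictions"][1]
    five_steps_ahead = result.inferences["multiStepBestPredictions"][5]
    print(one_step_ahead, five_steps_ahead)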

hi @rhyolight,

in the examples numenta_detector.py and numentaTM_detector.py, why don't the model params and the spatial pooler appear?
And I want to detect anomalies in ECG, so should I use getScalarMetricWithTimeOfDayAnomalyParams or the scalar encoder??

If you use getScalarMetricWithTimeOfDayAnomalyParams(), you are using the OPF, which instantiates the encoders for you. All you need to do is set up some model params in a dict nicely. That's what the function does. It gives you a configuration that includes scalar encoding parameters.

Hello everybody,

I am a little bit confused. @rhyolight you said:

According to this paper: "Unsupervised real-time anomaly detection for streaming data"
https://www.sciencedirect.com/science/article/pii/S0925231217309864

we first calculate the prediction error and based on that the anomaly likelihood.
So my understanding is the following:
to find a good anomaly likelihood I need a good prediction first.

So why would I swarm to get a good prediction, but not swarm for good anomaly detection (which is based on that prediction)?

Thanks a lot for your help. :blush:

I think there's a confusing overlap of terms going on here. By 'prediction error' in that paper they're actually referring to anomaly score (please correct me if I'm wrong @rhyolight, @subutai, @Scott, @mrcslws). The anomaly score is the TM's metric for the proportion of unpredicted activity in one time step, always ranging from 0 to 1.

If in forecasting mode, the TM also yields a more traditional prediction error (like MAPE), measuring the distance between the model-forecasted X1(t) and the observed X1(t). It's this forecasting-oriented prediction error that's being optimized by swarming.

This forecasting-oriented prediction error is totally different from the anomaly score "prediction error" from the paper. So HTM anomaly detection is based on prediction in a broad sense, but only on the anomaly-oriented prediction error (its native anomaly score), not the forecasting-oriented prediction error (measured in universal terms like MAPE).


sure…
So let's forget about the words and let's look at the math:
[I am sorry… I don't know how to do formulas here, but I'll try my best]

I am comparing the "union of each individual prediction" (the π(x_{t−1})) with the encoding of the current input and dividing by the number of ON bits in the current input.
So (in my opinion) the performance of the prediction is essential for the value of s_t (which is the prediction score or anomaly score).

Say I have very bad prediction performance… so my π does not fit the current input. Then my anomaly score is closer to 1 (or even 1 if my predictions are totally wrong).
Because of this my μ_t will be close to one and accordingly my σ² close to zero.
As a result, μ̃_t will be close to 1 as well.

So finally I will ALWAYS have an anomaly likelihood which is super close to 1.

I hope you understand what I mean…
That brings me back to my statement: when I have a bad prediction, I'll have bad anomaly detection (I'll find a lot of anomalies even when the behavior of my data is normal).

So back to my question:

I am really sorry if all this is totally easy and my calculation and understanding is just wrong :roll_eyes: … but I would really appreciate an answer to that, or maybe a link/paper where I can read further.

Thank you all very much in advance :blush:

This sounds right, so just to ensure we're on the same page (correct me if I'm off): there are a set of predicted columns from (t-1), which you represent as:

Then there are a set of activated columns at (t) – also known as SP bits, out of a standard 2048 total. So the anomaly score, which you refer to as:

is equal to the proportion of currently active SP columns which were not in the set of predicted columns → π(x_{t−1}).
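Written out, the raw score from the paper (as I read it, so please double-check against the paper's own equation) is:

    s_t = 1 - \frac{\pi(x_{t-1}) \cdot a(x_t)}{|a(x_t)|}

where a(x_t) is the binary vector of currently active columns and π(x_{t−1}) is the vector of columns that were predicted at the previous step.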

I'm almost sure this means that the Anomaly Likelihood isn't being calculated correctly. If most of your Anomaly Scores are close to 1 with a low standard dev, then another Anomaly Score of 1 should have a low Anomaly Likelihood, since it's still within the tight deviation.

If you think of a distribution of recent Anomaly Scores (say 500, for instance), that distribution will have a mean and standard deviation. If that mean is high (say 0.95) and the SD is low (say 0.05), then an influx of any low Anomaly Score values would yield high Anomaly Likelihoods, since they would be more dissimilar from that longer-term distribution. However, an influx of more high Anomaly Scores should yield low Anomaly Likelihoods, since they would be more similar to that distribution.
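For reference, the likelihood step in the paper (again my transcription, so verify against the paper) models the recent raw scores with a rolling Gaussian:

    L_t = 1 - Q\!\left(\frac{\tilde{\mu}_t - \mu_t}{\sigma_t}\right)

where μ_t and σ_t are the sample mean and standard deviation of the raw anomaly scores over a long window W, μ̃_t is the mean over a short recent window W′, and Q is the Gaussian tail function. So if the long-run mean is already high and new scores land near it, μ̃_t ≈ μ_t and L_t stays around 0.5 rather than saturating at 1.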

Are you using the Anomaly Likelihood module to get your scores or custom code?

I don't think this should be happening!

Here's a link to Scott's presentation on Anomaly Detection; I think it may be informative:

Here's a link to a presentation on swarming:
