Swarming for anomaly detection with multi-column datasets

designernasser · December 27, 2016, 6:19pm

Hello NuPIC,

I’m aware of the recommended model parameters for temporal anomaly detection on scalar streaming data, described in the topic "Don’t swarm for Anomaly models" (thanks, Matt).

But I have a dataset that looks like this:

| timestamp          | s0 | s1 | s2 | ... | s29 |    label |
------------------------------------------------------------
| 2016-02-01 8:00:00 |  0 |  0 |  0 | ... |   0 |        0 |
| 2016-02-01 8:00:01 |  1 |  0 |  0 | ... |   0 |        0 |
| 2016-02-01 8:00:02 |  1 |  1 |  0 | ... |   1 |        1 |
| 2016-02-01 8:00:03 |  1 |  0 |  0 | ... |   0 |        0 |

So it’s basically a timestamp field, 30 binary sensor readings, and a label with two values (0: normal, 1: anomaly).

Is the swarm utility provided with NuPIC capable of generating good model parameters for this kind of dataset? If so, what do I need to do to ensure the swarm process produces good results? (I know about using "swarmSize": "large", for example.)

Thanks!

rhyolight · January 5, 2017, 5:36pm

I think you should create your own encoder for this data. It seems very well aligned to be straight binary data. You could even re-use the DateEncoder for part of the encoding and just concat all the 1s or 0s for each field after it.
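A minimal sketch of that idea in plain Python, for illustration only. It does not call NuPIC's DateEncoder: `encode_time_of_day` is a hypothetical stand-in that activates a small wrapped block of bits for the hour of day, and the widths (`n=24`, `w=3`) and sensor values are assumptions, not recommended settings.

```python
from datetime import datetime


def encode_time_of_day(ts, n=24, w=3):
    """Hypothetical stand-in for a date encoder (NOT NuPIC's DateEncoder).

    Sets `w` consecutive bits, wrapping around, in an `n`-bit vector
    at a position derived from the hour of day.
    """
    bits = [0] * n
    start = ts.hour * n // 24
    for i in range(w):
        bits[(start + i) % n] = 1
    return bits


def encode_row(ts, sensors):
    """Concatenate the time encoding with the raw 0/1 sensor readings."""
    if any(s not in (0, 1) for s in sensors):
        raise ValueError("sensor readings must be 0 or 1")
    return encode_time_of_day(ts) + list(sensors)


ts = datetime(2016, 2, 1, 8, 0, 2)
sensors = [1, 1] + [0] * 27 + [1]  # s0..s29, illustrative values only
sdr = encode_row(ts, sensors)
print(len(sdr), sum(sdr))  # total width, number of active bits
```

With this layout, a record whose sensors are all 0 contributes active bits only from the time portion, which is worth keeping in mind when choosing the widths.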

designernasser · January 5, 2017, 6:47pm

Hey Matt! Thanks for taking the time to reply.

rhyolight wrote: "I think you should create your own encoder for this data. It seems very well aligned to be straight binary data. You could even re-use the DateEncoder for part of the encoding and just concat all the 1s or 0s for each field after it."

Yes, I’ve actually been doing just that and it’s producing good results. But my question is about the optimal SP and TM parameters for my model. Will the swarm utility in NuPIC help with this, or should I just go with trial and error for the model parameters?

rhyolight · January 5, 2017, 7:31pm

The parameters we use for anomaly models should be a good starting point.

But it could also depend on the dimensions of the input space you are creating with your encoder. Based on your data it should be pretty small. Once you have something that runs, why don’t you print out the model params you’re using and paste them here so we can discuss.
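One way to get a feel for the size of the input space before discussing parameters is to measure the encoder output directly. The sketch below assumes each encoded record is a list of 0/1 values; `input_space_stats` and the sample `encoded_rows` are made up for illustration and are not part of NuPIC.

```python
def input_space_stats(encoded_rows):
    """Report the width and active-bit counts of encoded records."""
    widths = {len(r) for r in encoded_rows}
    if len(widths) != 1:
        raise ValueError("all encoded rows must have the same width")
    width = widths.pop()
    active = [sum(r) for r in encoded_rows]
    return {
        "width": width,                # total number of input bits
        "min_active": min(active),     # fewest active bits in any record
        "max_active": max(active),     # most active bits in any record
        "mean_sparsity": sum(active) / float(len(active) * width),
    }


# Illustrative data only: three 8-bit records.
encoded_rows = [
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 1, 1, 0, 1],
]
stats = input_space_stats(encoded_rows)
print(stats)
```

Posting numbers like these alongside the model params makes it easier to judge whether the input width and the SP settings are a reasonable match.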
