Hello NuPIC,
I’m aware of the recommended model parameters for temporal anomaly detection on scalar streaming data, mentioned in the post “Don’t swarm for Anomaly models” (thanks, Matt).
But I have a dataset that looks like this:
| timestamp | s0 | s1 | s2 | ... | s29 | label |
------------------------------------------------------------
| 2016-02-01 8:00:00 | 0 | 0 | 0 | ... | 0 | 0 |
| 2016-02-01 8:00:01 | 1 | 0 | 0 | ... | 0 | 0 |
| 2016-02-01 8:00:02 | 1 | 1 | 0 | ... | 1 | 1 |
| 2016-02-01 8:00:03 | 1 | 0 | 0 | ... | 0 | 0 |
So it’s basically a timestamp field, 30 binary sensor readings, and a label that has two values (0: normal, 1: anomaly).
Is the swarm utility provided with NuPIC capable of generating good model parameters for this kind of dataset? If so, what do I need to do to ensure the swarm process produces good results? (I know about using "swarmSize": "large", for example.)
Thanks!
I think you should create your own encoder for this data. It seems very well aligned to be straight binary data. You could even re-use the DateEncoder for part of the encoding and just concat all the 1s or 0s for each field after it.
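A minimal sketch of that concatenation idea, in plain Python so it runs without NuPIC installed. The `encode_time_of_day` function is a hypothetical stand-in for the part the DateEncoder would handle, and `TIME_BITS`, `TIME_WIDTH`, and `encode_row` are names and values assumed purely for illustration, not anything from NuPIC itself:

```python
from datetime import datetime

TIME_BITS = 24   # assumed: one bucket per hour of the day
TIME_WIDTH = 3   # assumed: adjacent buckets switched on so nearby hours overlap


def encode_time_of_day(ts):
    """Hypothetical stand-in for the DateEncoder part: a cyclic hour-of-day code."""
    bits = [0] * TIME_BITS
    for offset in range(TIME_WIDTH):
        bits[(ts.hour + offset) % TIME_BITS] = 1
    return bits


def encode_row(timestamp, sensors):
    """Concatenate the time encoding with the raw 0/1 sensor readings."""
    ts = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return encode_time_of_day(ts) + [int(s) for s in sensors]


# One row with only the first of 30 sensors active (illustrative values).
row = encode_row("2016-02-01 8:00:01", [1] + [0] * 29)
print(len(row), sum(row))
```

Each sensor contributes a single bit here, so the number of active bits changes from row to row. Whether that gives the spatial pooler enough to work with is something to check empirically rather than assume.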
Hey Matt! Thanks for taking the time to reply.
Yes, I’ve actually been doing just that, and it’s producing good results. But my question is about the optimal SP and TM parameters for my model. Will the swarm utility in NuPIC help with this, or should I just go with trial and error for the model parameters?
The parameters we use for anomaly models should be a good starting point.
But it could also depend on the dimensions of the input space you are creating with your encoder. Based on your data, it should be pretty small. Once you have something that runs, why don’t you print out the model params you’re using and paste them here so we can discuss?
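To put a number on the input space before pasting params, something like the sketch below could help. It reports the encoder output width and the range of active bits over a few encoded rows. The helper `describe_input`, its key names, and the toy 6-bit rows are all made up for illustration and are not taken from the dataset above:

```python
import json


def describe_input(encoded_rows):
    """Summarize encoder output: total width and range of active bits per row."""
    widths = set(len(r) for r in encoded_rows)
    if len(widths) != 1:
        raise ValueError("all encoded rows must have the same width")
    active = [sum(r) for r in encoded_rows]
    return {
        "inputWidth": widths.pop(),
        "minActiveBits": min(active),
        "maxActiveBits": max(active),
    }


# Toy encoded rows (6 bits each), standing in for real encoder output.
rows = [
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1],
]
summary = describe_input(rows)
print(json.dumps(summary, indent=2, sort_keys=True))
```

The same `json.dumps(..., indent=2, sort_keys=True)` call is a convenient way to print a model params dictionary in a form that is easy to paste into a post, provided the dictionary holds only JSON-serializable values.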