Hi all,
I’m confused about how timestamps are handled in the OPF.
My data has a 1-minute sampling interval, but there are points where the date jumps
(i.e., data collection failed for a few days, which can happen in the real world).
timestamp (other fields exist but are not shown here)
2016-08-16 20:54:00
2016-08-16 20:55:00
2016-08-16 20:56:00
2016-08-16 20:57:00
2016-08-16 20:58:00
2016-08-16 20:59:00
2016-08-16 21:00:00
2016-08-16 21:01:00
2016-08-16 21:02:00
2016-08-16 21:03:00
2016-09-12 00:01:00
2016-09-12 00:02:00
2016-09-12 00:03:00
2016-09-12 00:04:00
2016-09-12 00:05:00
2016-09-12 00:06:00
Using this data, I ran the OPF (anomaly detection and 1-step prediction) and got anomaly scores and predicted values.
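For reference, here is roughly how I run the model (a simplified sketch; MODEL_PARAMS is the parameter dict from swarming, and the field name "value" and the file name are placeholders for my real ones):

    import csv
    from datetime import datetime

    from nupic.frameworks.opf.model_factory import ModelFactory

    # MODEL_PARAMS comes from swarming (not shown here).
    model = ModelFactory.create(MODEL_PARAMS)
    model.enableInference({"predictedField": "value"})

    with open("my_data.csv") as f:  # placeholder file name
        for row in csv.DictReader(f):
            result = model.run({
                "timestamp": datetime.strptime(row["timestamp"],
                                               "%Y-%m-%d %H:%M:%S"),
                "value": float(row["value"]),
            })
            rawAnomalyScore = result.inferences["anomalyScore"]
            # 1-step-ahead prediction, i.e. for the next record
            predicted = result.inferences["multiStepBestPredictions"][1]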
1) Predicted value
In this case, what does the next value predicted from the values up to 2016-08-16 21:03:00 mean?
I guess it is the value predicted for 2016-08-16 21:04:00, which is not what I wanted to get.
Then, what about the value predicted from the values up to 2016-09-12 00:01:00? Is it for 2016-09-12 00:02:00?
Is my understanding right?
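To make the question concrete, this is how I pair each 1-step prediction with the following row (a sketch; whether this pairing makes sense across the gap is exactly what I am asking):

    # rows is a list of (datetime, float) pairs from the CSV above.
    prevTimestamp, prevPrediction = None, None
    for ts, value in rows:
        if prevPrediction is not None:
            # Across the gap this pairs the prediction made at
            # 2016-08-16 21:03:00 with a value observed weeks later.
            print("%s -> %s predicted %.3f, actual %.3f"
                  % (prevTimestamp, ts, prevPrediction, value))
        result = model.run({"timestamp": ts, "value": value})
        prevTimestamp = ts
        prevPrediction = result.inferences["multiStepBestPredictions"][1]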
2) Reset
As I understand it, I could call “reset” at 2016-09-12 00:01:00.
Does “reset” discard everything learned before 2016-09-12 00:01:00 and start a whole new round of learning?
Or does it just ignore the transition from 2016-08-16 21:03:00 to 2016-09-12 00:01:00 and resume learning?
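Concretely, I would detect the gap and reset like this (a sketch; the one-minute threshold is my own assumption):

    from datetime import timedelta

    EXPECTED_STEP = timedelta(minutes=1)
    prevTimestamp = None

    for ts, value in rows:
        # Reset sequence state whenever the gap exceeds the
        # normal 1-minute sampling interval.
        if prevTimestamp is not None and ts - prevTimestamp > EXPECTED_STEP:
            model.resetSequenceStates()
        model.run({"timestamp": ts, "value": value})
        prevTimestamp = ts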
The predicted values are the same whether or not I call model.resetSequenceStates() at the beginning of the 2016-10-13 data, but the raw anomalyScore and anomalyLikelihood changed, as shown below.
Result with reset called at 2016-10-13 11:47

timestamp         actualValue  predictedValue  rawAnomalyScore  anomalyLikelihood
2016-10-10 15:36  6.3          6.279269193     0.025            0.840672164
2016-10-10 15:37  6.3          6.285488435     0.025            0.539804591
2016-10-10 15:38  6.1          6.289841904     0                0.651632888
2016-10-10 15:39  6.3          6.292889333     0.025            0.651632888
2016-10-10 15:40  6.3          6.235022533     0.025            0.651632888
2016-10-13 11:47  7.2          6.254515773     1                0.730254637
2016-10-13 11:48  7.4          6.268161041     0.875            0.931660019
2016-10-13 11:49  7.2          6.547712729     0.25             0.952377002
2016-10-13 11:50  7.4          6.80339891      0                0.931660019
Result without reset

timestamp         actualValue  predictedValue  rawAnomalyScore  anomalyLikelihood
2016-10-10 15:36  6.3          6.279269193     0.075            0.605100109
2016-10-10 15:37  6.3          6.285488435     0.025            0.503793428
2016-10-10 15:38  6.1          6.289841904     0.025            0.616145098
2016-10-10 15:39  6.3          6.292889333     0.025            0.605454426
2016-10-10 15:40  6.3          6.235022533     0                0.616145098
2016-10-13 11:47  7.2          6.254515773     0.625            0.656029388
2016-10-13 11:48  7.4          6.268161041     0.3              0.742986019
2016-10-13 11:49  7.2          6.547712729     0.275            0.809383258
2016-10-13 11:50  7.4          6.80339891      0                0.733915274
For this kind of analysis, which of the two should I use?
And I’m also curious why the predicted values did not change at all.
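For reference, in both runs I compute anomalyLikelihood from the raw score like this (a sketch; the AnomalyLikelihood constructor is left at its defaults):

    from nupic.algorithms.anomaly_likelihood import AnomalyLikelihood

    likelihoodHelper = AnomalyLikelihood()

    # Inside the main loop, after model.run():
    anomalyLikelihood = likelihoodHelper.anomalyProbability(
        value, rawAnomalyScore, ts)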
3) Aggregation in swarming
What if I want to aggregate the data into 5-minute blocks?
Does NuPIC aggregate only records with consecutive timestamps, or does it collapse any five consecutive records regardless of the actual timestamps (e.g., a mix of data from 8/16 and 9/12)?
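In the swarm description I would expect to request this roughly as below (a sketch; the file path, the field name "value", and the choice of "mean" as the aggregation function are assumptions on my part):

    SWARM_DESCRIPTION = {
        "streamDef": {
            "info": "my_data",
            "version": 1,
            "streams": [{
                "info": "my_data.csv",
                "source": "file://my_data.csv",
                "columns": ["*"],
            }],
            # Collapse records into 5-minute buckets.
            "aggregation": {
                "years": 0, "months": 0, "weeks": 0, "days": 0,
                "hours": 0, "minutes": 5,
                "seconds": 0, "milliseconds": 0, "microseconds": 0,
                "fields": [("value", "mean")],
            },
        },
        "inferenceType": "TemporalAnomaly",
        "inferenceArgs": {"predictionSteps": [1], "predictedField": "value"},
        "swarmSize": "medium",
    }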
Thank you in advance.