HTM loses its prediction ability

@scott; @rhyolight
Hello both, I am continuing to work on this anomaly detection problem:

(threshold = 0.999)
The data I used is just nyc_taxi.csv with some artificial anomalies. The normal data follows a pattern similar to the one in the green circle, and the artificial anomalies are in the red circle. What puzzles me most this time is that HTM fails to give reasonable predictions; instead, it simply follows the raw data step by step.
From where I stand, this is a serious problem for HTM, since the premise that HTM can detect anomalies rests entirely on its prediction ability. The loss of prediction ability therefore inevitably leads to a failure to detect anomalies (in the red circle, and especially in the orange circle).

As for the program, I just modified the code from the one_gym example, changing the range of the data and the timeOfDay radius (to 0.5).
To be honest, I love HTM; its underlying ideas are beautiful and elegant. However, I cannot apply it to my project if it fails on such an extreme and obvious anomaly. I have great confidence in HTM’s theory, so please help me find the reason for this abnormal failure.


HTM does this when it doesn’t have enough information to make a prediction. It takes time after an anomaly occurs before good predictions start coming again.

You might prevent this if you turn off learning before the anomalies start. That way the model will not be perturbed by the anomalous data and try to learn it. It might help.
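A minimal sketch of what that could look like with an OPF model set up as in the one_gym example. MODEL_PARAMS, the `records` list, and the 4000-record cutoff are assumptions standing in for your actual code, not something from your script:

```python
# In older NuPIC versions the module is nupic.frameworks.opf.modelfactory.
from nupic.frameworks.opf.model_factory import ModelFactory

# MODEL_PARAMS and `records` are placeholders for your one_gym-style
# parameters and the parsed rows of your CSV.
model = ModelFactory.create(MODEL_PARAMS)
model.enableInference({"predictedField": "c1"})

for i, record in enumerate(records):
    if i == 4000:                 # assumed boundary between normal data and anomalies
        model.disableLearning()   # freeze learning so anomalous data is not memorized
    result = model.run(record)
    print(result.inferences["anomalyScore"])
```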

Looking just at the anomaly score, it looks much higher and more erratic now, which is what we were hoping for before. I’m not sure this is a good thing, but it seems you were at least able to affect it with some of the previous suggestions.

Now consider the anomaly likelihood: it attempts to estimate how unusual the recent values are compared to some historical window. That may be problematic here because this data is dominated by artificial anomalies, so the algorithm probably treats the frequently changing data as the norm, and only very extreme deviations would exceed the likelihood threshold. It would probably help to have 1000+ records of “normal” behavior before you start introducing anomalies. And if you introduce too many anomalies in a short period, the definition of “unusual” becomes harder and harder to meet. Just something to keep in mind: the likelihood is designed explicitly to handle very noisy datasets, and if 50% of a dataset is anomalies, it just starts to look like noise.
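For reference, the likelihood is computed on top of the raw anomaly scores roughly like this (a sketch using NuPIC’s AnomalyLikelihood helper; `value`, `anomaly_score`, and `timestamp` are whatever your loop already has for each record, and the 0.999 threshold matches the one mentioned above):

```python
from nupic.algorithms.anomaly_likelihood import AnomalyLikelihood

# Keeps a rolling window of raw anomaly scores and estimates how unlikely
# the most recent scores are relative to that history.
likelihood_helper = AnomalyLikelihood()

likelihood = likelihood_helper.anomalyProbability(value,          # raw metric value
                                                  anomaly_score,  # raw anomaly score
                                                  timestamp)
if likelihood > 0.999:
    print("flagged as anomalous at", timestamp)
```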

That said, your results are pretty surprising to me; I would have expected more prominent anomalies to be flagged. But perhaps you just need to make sure that you don’t create too many artificial anomalies too close together.

Hi, Taylor, thanks. Yes, I got what you said.
Actually, this picture is just a glimpse of one piece of the whole dataset. Before these anomalies there are about 4000 normal records, which I thought would be enough for HTM to learn from, at least enough to tell the extreme anomalies apart from normal data.
Therefore, what frustrates me most is that HTM loses its prediction ability right at the beginning of the extreme anomalies.
Yes, I will try your advice, separating the training phase from the testing phase, and see what effect it has. To be honest, I am a little skeptical, as the explanation behind it does not quite match what I am seeing.
Thanks again, Taylor.

Hello, Scott, thanks for your response.
As for the picture, I only plotted the part of the data with the artificial anomalies. In fact, before the data shown, there are about 4000 normal records. Given that, I expected HTM to have learned enough to be a good anomaly detector, or at least to recognize such extreme anomalies.

The anomaly pattern I introduced simply emulates the one_gym example Taylor presented, in which he deleted the data for a Tuesday in October, and HTM detected that anomaly perfectly.

What hurts me most is that HTM fails to predict right at the beginning of the extreme anomalous data. It shouldn’t, because that pattern has only just appeared; in other words, there is no pattern in HTM’s memory even similar to it.
Yes, I will follow your advice and introduce anomalous data in smaller quantities.
Thanks again Scott.

I don’t think you should look at the prediction curve too much when you’re doing anomaly detection. Several predictions are being made, and only the most likely one is plotted. When the model encounters an anomalous event, the prediction of a nominal curve no longer appears because it is no longer a possibility. So the prediction will closely follow the actual data unless the predicted value is hugely different.

Focus mostly on the anomaly score, since it indicates the presence of anomalies and failed predictions. You can see from your data that the anomaly scores are really high for the anomalous section of the data, and they return to sane levels once the input returns to nominal.

Train your model on lots of nominal data and show us what the anomaly scores look like for nominal inputs. They should be low or at least manageable.
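A quick sketch of what that check could look like, assuming the same TemporalAnomaly model as above and a hypothetical `nominal_records` list holding only the normal rows:

```python
# Collect anomaly scores over nominal data only; after the daily pattern has
# been learned they should settle down to low values.
scores = []
for record in nominal_records:
    result = model.run(record)
    scores.append(result.inferences["anomalyScore"])

print("mean anomaly score on nominal data:", sum(scores) / len(scores))
```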


Hi Jacob, thank you for your kind and terrific response.
From what you said, I take away the following:

  1. HTM often appears to fail to predict when an anomaly occurs, because its predictions have very low likelihood and are discarded; what is plotted then just tracks the real data at that time.

Do I get the point? I looked at the source code; there is indeed a so-called “best prediction”, but I couldn’t find any logic that mimics the raw data, and I wonder how the likelihoods of the predictions (not the anomaly likelihood value) are calculated. Does the calculation take advantage of the real data in advance? I still have no idea where to find this in the source code.

  2. The model actually outputs the right results; it is better to focus on analyzing the anomaly score.

In my opinion, the anomaly score curve should stay very high from the moment anomalous data appears until HTM has learned the pattern. However, that never happens; what’s more, at some points (where an obvious anomaly went undetected by HTM) the curve remains surprisingly placid.

  3. Compared with the abnormal data, the anomaly score curve over normal data should be low and well behaved.

(P.S.: the axes in this image (xlim [0, 400], ylim [0, 28000]) differ a little from those in the first picture (xlim [0, 800], ylim [-10000, 60000]). This difference has no effect on the results but may affect how the curves look.)

The image above gives a glimpse of the normal data (there are about 3500 normal records before the data portrayed in this picture). To be honest, I cannot figure out why the anomaly score curve spikes in some places that seem to be normal.

–Pegasus

What are your inputs? Do you have a time input? Some of those anomalies could be differences in the timing of the rising or falling curves. It’s also clear in the first 3 anomaly spikes that there are differences in the preceding curves that would precipitate those anomaly scores.

Try looping this data set over and over, about 20 times, and you should see the anomaly scores zero out. That would make a more robust “nominal” data model and should make it more sensitive to the actual anomalies you are interested in.
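Something like this, as a sketch; `read_records` is a placeholder for however you already parse the CSV into `{"c0": datetime, "c1": int}` dicts:

```python
# Loop the nominal data through the model repeatedly so the sequences are
# well learned, then freeze learning before feeding the data under test.
for _ in range(20):
    for record in read_records("nyc_taxi_nominal.csv"):
        model.run(record)

model.disableLearning()
```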


Hi Jacob,
yes, I have a time input, and the data format looks like:

c0,c1
datetime,int
T,
2014-07-01 00:00:00,10844
2014-07-01 00:30:00,8127
2014-07-01 01:00:00,6210
2014-07-01 01:30:00,4656
2014-07-01 02:00:00,3820
2014-07-01 02:30:00,2873
2014-07-01 03:00:00,2369
2014-07-01 03:30:00,2064

Thanks for your considerate advice; I will try it that way.

Yes, but how are you encoding it?

Hi Jacob,
For the datetime: timeOfDay = [21, 0.5], because the records in this data come every half an hour (the second parameter of timeOfDay is the radius, so I chose this value to match the data’s granularity; is that right?).
For the value (c1): minVal, maxVal = 0, 28000.
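For context, those settings would appear in the model params roughly like this. This is only a sketch following the one_gym layout; the exact keys, encoder types, and the n/w values depend on your model_params.py, so treat them as placeholders:

```python
# Fragment of the sensor encoder params (one_gym style); values are illustrative.
ENCODER_PARAMS = {
    "c0_timeOfDay": {
        "fieldname": "c0",
        "name": "c0_timeOfDay",
        "type": "DateEncoder",
        "timeOfDay": (21, 0.5),   # 21 bits wide, 0.5-hour radius
    },
    "c1": {
        "fieldname": "c1",
        "name": "c1",
        "type": "ScalarEncoder",
        "minval": 0,
        "maxval": 28000,
        "n": 400,
        "w": 21,
        "clipInput": True,
    },
}
```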

Yes, but only if the time of day is meaningful to the data. Do the cycles line up with the half hours?

You can see that the first two anomaly spikes correspond to rightward drifts of the waveform. However, if the waveforms are not synchronized to the time cycles, then you should probably not use that as an input.

Well, you could, but you would have to train a LOT more data to account for all possible phase offsets.

You can just drop the time input if there is no periodic correlation with time. You should get better results as long as the input samples are evenly spaced at a constant 0.5 hours, as you say.

Hi Jacob,
Thanks. Yes, the time is indeed meaningful to the data. You can easily see the periodic waveform in the image; in fact, one cycle is one day, and since records are generated every half an hour, 48 records make up one cycle.

Thanks for your patient and detailed response, sincerely.
I have learned a lot from your answers, but I am still puzzled by the question below; if you could give me a hand, I would greatly appreciate it.

Thanks.

I actually don’t know how predictions and their likelihoods are calculated. I’m not familiar with that part of the code.

See this video for some info:

This uses the old CLAClassifier (we now use the SDRClassifier) but the ideas are the same.
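Regarding the earlier question about where the prediction likelihoods come from: they are produced by the classifier. A rough sketch of the SDRClassifier API with made-up inputs follows; in the real pipeline the classifier is fed the temporal memory’s active cells at each timestep:

```python
import numpy
from nupic.algorithms.sdr_classifier import SDRClassifier

# The classifier maps the active cells (patternNZ) to a likelihood distribution
# over value buckets; the plotted "best prediction" is simply the bucket with
# the highest likelihood.
classifier = SDRClassifier(steps=[1], alpha=0.1, actValueAlpha=0.3)

# Made-up alternating records, just to exercise the API; the distribution
# sharpens as more records are learned.
records = [([1, 5, 9], 4, 10844),
           ([2, 6, 10], 3, 8127)] * 10

for i, (pattern, bucket, value) in enumerate(records):
    result = classifier.compute(recordNum=i,
                                patternNZ=pattern,
                                classification={"bucketIdx": bucket,
                                                "actValue": value},
                                learn=True,
                                infer=True)

likelihoods = result[1]                     # 1-step-ahead likelihood per bucket
best = int(numpy.argmax(likelihoods))
print("most likely next value:", result["actualValues"][best])
```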

Oh yeah, thanks, Taylor, I will watch it right away.

Hey @Pegasus you might try running HTM Studio on your data. It is running NuPIC under the hood, and it is tuned to do anomaly detection on scalar data streams.

Hi, @rhyolight, Thanks for your advice, I will try it later. :grinning: