Bad results with anomaly detection


#1

Hi,
I ran my anomaly detection code on an ECG signal, but it gives bad results. That's OK, I will update my code: I will try to use numenta_detector and getScalarMetricWithTimeOfDayAnomalyParams() (like NAB), and I guess that with those two my code will give good results. The problem is that when I change some parameters (just to understand why my results are so bad :sob:), the results stay the same: it detects 2 anomalies, but there are 4.
The parameters I changed are learning period, historic window size, estimation samples, and estimation period = 100. I changed them many, many times but always got the same result. I also read some papers that use other ML methods: some use the whole ECG signal and some do not; they cut the ECG and use each wave (P, QRS, T) separately.
So I got confused.
I have some questions:

  1. HTM is unsupervised anomaly detection, which means the data are not labelled. So why do some examples use HTM with a labelled dataset?

  2. In anomaly detection, is the training done with a normal dataset, an abnormal one, or both?

  3. And the important one for me:
    we all know that an ECG signal has 3 waveforms (P, QRS and T waves).
    So should I create one encoder for the whole ECG signal (like I did) and then train, predict, and so on, or cut my ECG signal and create an encoder for each wave? Data that is considered an anomaly in the P wave may not be one in the QRS wave (they don't have the same features).


#2

Let me try my best.

  1. HTM by itself is unsupervised. The Spatial Pooler and Temporal Pooler learn on their own. But the classification algorithms (KNN, NN, SVM, SDR classifier) that convert SDRs into categories are supervised.
    That's not to say anomaly detection is supervised, though. Anomaly detection in HTM is unsupervised.

  2. Train with the normal dataset. The TM generates an anomaly signal when it sees unknown patterns. So feed the TM normal signals and it will tell you when an abnormal one is present.

  3. I'd say try both. There is no single correct way to do it. Ideally HTM should learn the entire pattern of an ECG wave and thus be capable of finding anomalies anywhere in the wave. But it's not a bad idea to give HTM some help.
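
To make point 2 a bit more concrete, here is a minimal sketch of how a raw anomaly score is usually described for HTM: the fraction of currently active columns that were not predicted at the previous step. The function name and the column indices are made up for illustration; this is not taken from your code or from any particular library.

```python
def raw_anomaly_score(active_columns, predicted_columns):
    """Fraction of active columns that were NOT predicted at the previous step.

    0.0 means the input was fully expected, 1.0 means it was completely novel.
    """
    active = set(active_columns)
    if not active:
        # Nothing is active, so there is nothing to be surprised about.
        return 0.0
    predicted = set(predicted_columns)
    return 1.0 - len(active & predicted) / len(active)


# Hypothetical column indices, for illustration only.
print(raw_anomaly_score([3, 7, 11, 42], [3, 7, 11, 42, 50]))  # fully predicted
print(raw_anomaly_score([1, 2, 5, 8], [3, 7, 11, 42]))        # nothing predicted
print(raw_anomaly_score([3, 7, 5, 8], [3, 7, 11, 42]))        # half predicted
```

A model that has only ever seen normal beats should keep this score near 0 on normal data and push it towards 1 when an unfamiliar pattern shows up.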

BTW, can you share your code? It is difficult to know what is going on without your code and results.


#3

Train with the normal dataset

Me too: at first I thought training uses only the normal data (like one-class SVM), but some researchers say that unsupervised ML is trained on a dataset that contains both normal and abnormal data.

I'd say try both. There is no single correct way to do it. Ideally HTM should learn the entire pattern of an ECG wave and thus be capable of finding anomalies anywhere in the wave. But it's not a bad idea to give HTM some help.

Sorry, but I don't have much time to try both; this is why I'm asking.

The parameters I changed are learning period, historic window size, estimation samples, and estimation period = 100. I changed them many, many times but always got the same result.

Like I said before, I know my code gives bad results and I will make some updates (I am trying to use the numenta detector). But the problem is: why do the results stay the same when I change the parameters?

3. The problem is that I don't have an ECG data set. For the moment I am using this data set,
and this data set was used by @rhyolight, who used the whole ECG signal (like I did), but it contains normal + abnormal data.
I use only 3600 records of this data, and this is my result:


In the picture (3600 records) there are 4 anomalies, and I detect 2 of them (in the red circles).


#4

I don’t know about ECG data specifically, but I would say that including abnormal data in the training is generally a bad idea. HTM will learn the abnormal sequences more quickly than most other ML algorithms (in a single shot in many cases), and if those same abnormal patterns happen again later after training, they will not be considered anomalous any more.
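
Here is a toy illustration of that effect. It is a plain first-order transition memory, not HTM's Temporal Memory, and the beat "symbols" are invented, but it shows how a single exposure to an abnormal sequence during training is enough to stop it from being flagged later.

```python
class ToyTransitionMemory:
    """Toy first-order sequence memory (NOT HTM): remembers which symbol followed which."""

    def __init__(self):
        self.seen = set()
        self.prev = None

    def reset(self):
        """Forget the previous symbol at the start of a new sequence."""
        self.prev = None

    def step(self, symbol, learn=True):
        """Return 1.0 if the transition prev -> symbol is unknown, else 0.0."""
        transition = (self.prev, symbol)
        score = 0.0 if transition in self.seen else 1.0
        if learn:
            self.seen.add(transition)  # learned in a single shot
        self.prev = symbol
        return score


def train(memory, sequences):
    for seq in sequences:
        memory.reset()
        for symbol in seq:
            memory.step(symbol, learn=True)


def score(memory, seq):
    memory.reset()
    return [memory.step(symbol, learn=False) for symbol in seq]


# Hypothetical symbolic beats, for illustration only.
normal = list("PQRST")
abnormal = list("PQRRT")

clean = ToyTransitionMemory()
train(clean, [normal] * 3)               # trained on normal beats only

mixed = ToyTransitionMemory()
train(mixed, [normal] * 3 + [abnormal])  # one abnormal beat slipped into training

print(score(clean, abnormal))  # unfamiliar transitions are flagged
print(score(mixed, abnormal))  # nothing is flagged any more
```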


#5

I just wanted to poke my head in to remind everyone that this is a hard, unsolved problem. Signal analysis has potential as an application of HTM, but I think we need to extract features from the signal (or feature windows) instead of sending in the raw signal. This is the same idea as using DL to extract features from video.
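
As a rough sketch of what "feature windows" could look like, the snippet below slides a fixed window over a signal and summarizes each window with a few simple statistics that could then be encoded instead of the raw samples. The window size, the chosen statistics, and the synthetic sine signal are all assumptions for illustration; real ECG features would need domain knowledge.

```python
import math
from statistics import mean, pstdev


def window_features(signal, window_size, step):
    """Summarize each window of the signal with a few simple statistics."""
    features = []
    for start in range(0, len(signal) - window_size + 1, step):
        window = signal[start:start + window_size]
        features.append({
            "start": start,
            "mean": mean(window),
            "std": pstdev(window),
            "peak_to_peak": max(window) - min(window),
        })
    return features


# Synthetic stand-in for a periodic signal (not real ECG data).
signal = [math.sin(2 * math.pi * i / 50) for i in range(200)]
signal[120] += 3.0  # injected spike

feats = window_features(signal, window_size=50, step=50)
for f in feats:
    print(f["start"], round(f["std"], 3), round(f["peak_to_peak"], 3))
```

The window that contains the spike stands out in its peak-to-peak value, which is the kind of feature stream that may be easier for HTM to model than the raw samples.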

Also, recommended reading:


#6

Yes, if you want a sequence to be anomalous to the TM, it's best that it hasn't seen it at all.

One other thought is that if some of the anomalies are point/spatial anomalies (short-term spikes), the Anomaly Likelihood isn't necessarily going to pick them up unless the anomaly scores are very low overall. It may be worth augmenting the system with a simple threshold or other rule of thumb specific to this application domain. The TM will find temporal anomalies very well, though the Anomaly Likelihood window size parameters can limit its ability to catch non-temporal anomalies (if my understanding is basically right).
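
Something like the following is what I mean by a simple rule of thumb: flag a point if either the anomaly likelihood crosses a threshold or the raw value falls clearly outside the range seen so far. The class name, the 0.9999 threshold and the 5% tolerance are placeholder assumptions, and the likelihood values are expected to come from whatever HTM pipeline you already have.

```python
class SpikeAwareDetector:
    """Combine an anomaly-likelihood threshold with a simple value-range check."""

    def __init__(self, likelihood_threshold=0.9999, spatial_tolerance=0.05):
        self.likelihood_threshold = likelihood_threshold  # assumed placeholder value
        self.spatial_tolerance = spatial_tolerance        # assumed placeholder value
        self.min_seen = None
        self.max_seen = None

    def is_spatial_anomaly(self, value):
        """True if value lies outside the observed range by more than the tolerance."""
        if self.min_seen is None or self.min_seen == self.max_seen:
            return False  # not enough history to define a range yet
        margin = (self.max_seen - self.min_seen) * self.spatial_tolerance
        return value > self.max_seen + margin or value < self.min_seen - margin

    def update(self, value, likelihood):
        """Return True if this point should be flagged, then update the range."""
        spatial = self.is_spatial_anomaly(value)
        # Update the range only after the check, so a spike is compared to past data.
        # Note: once a spike is seen it widens the range, so similar spikes pass later.
        self.min_seen = value if self.min_seen is None else min(self.min_seen, value)
        self.max_seen = value if self.max_seen is None else max(self.max_seen, value)
        return spatial or likelihood >= self.likelihood_threshold


detector = SpikeAwareDetector()
values = [0.0, 1.0, 0.5, 0.2, 0.8, 3.0, 0.5]  # made-up samples with one spike
flags = [detector.update(v, likelihood=0.5) for v in values]
print(flags)
```

Here the likelihood is held at an uninformative 0.5 on purpose, so the only flag comes from the range check on the spike.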


#7

Another possibility is to treat the ECG as images, encode the images into SDRs, and then detect anomalies…


#8

@paul_lamb and @sheiser1, yes, maybe this is why I can't detect all the anomalies: I'm training with a dataset that contains normal and abnormal data.
@thanh-binh.to yes, I'm sure. I read a paper that uses the ECG as an image, but unfortunately I must use numeric data.