I am trying to test the htm.core anomaly detector on epileptic seizures. The issue is that after some seizures, the algorithm stops recognizing them as anomalous, so I conclude it is learning them as normal behaviour. From other posts on the forum, I gather the best option is to train my algorithm on normal data with no seizures, then disable learning and introduce data that contains seizures. How could I code this? What is the method to disable learning in htm.core?
Also, I have another, more theoretical question. If I train on a normal dataset with no seizures and then disable learning, isn't that just like any other batch ML algorithm? I thought the main advantage of HTM is that it is constantly learning and adjusting. Can we make it stop learning anomalies after a certain number of observations while still learning non-anomalous data?
Thank you in advance.
Check the docs for the TM and SP's .compute() arguments; there's a boolean to enable learning, which in most examples is set to True.
Regarding the philosophical issue, it boils down to what is most valuable: biological orthodoxy or seizure detection.
What would be even more valuable is seizure prediction, with a lead time of at least a few seconds.
That is correct; it is only considered an anomaly the first time it happens.
You could run two instances in parallel, one with learning off, one with learning on.
If you are trying to capture the events leading up to the seizure, you could delay one instance in time and use the anomaly value to modulate model parameters or toggle the learning switch. This all depends on what you need it to do.
@cezar_t 's got it.
Here’s how I control learning in my implementation:
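Roughly, it comes down to a single flag that gates every compute() call. A minimal stand-in sketch of the idea (SimpleModel and LEARN_STEPS are illustrative names, not the real htm.core classes; the real SP and TM compute() calls take the same kind of learn flag):

```python
# Illustrative sketch only: SimpleModel stands in for an htm.core
# SpatialPooler + TemporalMemory pipeline, whose compute() calls
# accept the same kind of learning flag.

class SimpleModel:
    """Stand-in for an SP+TM pipeline; counts how often it learns."""
    def __init__(self):
        self.learn_calls = 0

    def compute(self, value, learn):
        if learn:
            self.learn_calls += 1   # a real model would update permanences here
        return 0.0                  # a real model would return an anomaly score

LEARN_STEPS = 100  # train on this many known-normal records, then freeze

model = SimpleModel()
for step in range(150):
    learn = step < LEARN_STEPS      # single switch controlling plasticity
    model.compute(step, learn=learn)

print(model.learn_calls)  # 100: learning stopped after the training phase
```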
The TM will learn any sequential patterns that repeat over time, so all inputs are anomalous at first and only become familiar once they repeat. I think this question comes down to the nature of the anomalies. If the anomalies are random and chaotic, the TM will never learn them because they are just noise. In this case there’s no real need to turn off learning for fear that the anomalies will be learned and no longer seen as anomalies. If however the anomalies aren’t just noise and constitute their own patterns, the TM will eventually come to recognize them and not see them as anomalous.
Since we don’t know going in which case we’re dealing with, I think it is safest to train with data known to be non-anomalous and then run the remaining data with learning off. The downside of this approach is that if new normal patterns emerge (non-anomalies) the TM will see them as anomalies since learning is off.
Thanks a lot for all the responses! All of them have been very useful and understandable.
So, from @sheiser1, I understand that I should train the model on non-anomalous data (with no seizures), and then turn off learning before introducing the data with seizures. I wonder if the procedure to achieve this is the following:
- Execute spatial pooling over the normal data with: sp.compute(normal_data, True, activeColumns)
- Execute temporal memory algorithm: tm.compute(activeColumns, learn=True)
- Execute spatial pooling over the data with seizures and learning disabled: sp.compute(sz_data, False, activeColumns)
- Execute temporal memory algorithm again (with learning off): tm.compute(activeColumns, learn=False)
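In loop form, the procedure I have in mind looks like this. The Sp and Tm classes below are toy stand-ins that only mimic htm.core's compute() signatures (SpatialPooler.compute(input, learn, output) and TemporalMemory.compute(columns, learn)), so the sketch runs without the library; the real objects would be swapped in:

```python
class Sp:
    """Toy stand-in for htm.core's SpatialPooler; records its learn flags."""
    def __init__(self):
        self.learn_flags = []
    def compute(self, input_bits, learn, active_columns):
        self.learn_flags.append(learn)
        active_columns[:] = input_bits   # a real SP would do pooling here

class Tm:
    """Toy stand-in for htm.core's TemporalMemory; records its learn flags."""
    def __init__(self):
        self.learn_flags = []
    def compute(self, active_columns, learn):
        self.learn_flags.append(learn)

def run(sp, tm, records, learn):
    """One pass over the data, with a single learn switch for SP and TM."""
    for input_bits in records:
        active_columns = [0] * len(input_bits)
        sp.compute(input_bits, learn, active_columns)
        tm.compute(active_columns, learn=learn)

sp, tm = Sp(), Tm()
normal_records = [[1, 0, 1]] * 5
seizure_records = [[0, 1, 1]] * 5
run(sp, tm, normal_records, learn=True)    # phase 1: train on interictal data
run(sp, tm, seizure_records, learn=False)  # phase 2: evaluate with model frozen
```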
I have tried disabling learning after showing the model some interictal samples, but now it does not detect any seizures, not even the first ones. What could be the reason?
Could you show how you came to this conclusion?
I assume there are certain points in time you have labeled as seizures right?
Are you looking for jumps in anomaly score or likelihoods at those times but finding none?
If we could see the time plots of the raw data aligned with the corresponding anomaly scores and likelihoods that would help!
Is this after you learned seizures?
Here are my results.
Here I use a synthetic dataset with 4 seizures (the preictal region is marked in yellow in the figures).
If I feed the model the dataset with learning on, the anomaly likelihood jumps at the first two seizures but then does not jump at the following ones:
However, if I train the model with a chunk of interictal data and then turn off learning and introduce the same previous synthetic dataset with 4 seizures, the anomaly likelihood does not jump.
I have restarted the kernel to run this experiment, so in theory the model has not seen any seizures yet.
When you trained on the bit you wanted to trigger the anomaly signal you told it that this is normal and not to flag it.
Don’t do that.
Train it on what you do NOT want it to flag, then turn off training.
The anomaly signal is a flag that something is surprising; after it has trained on the sample it is not a surprise any more.
Yes, training should be only with non-seizure data (is that what ‘interictal’ means?).
Also the EEG signal data doesn’t look so periodic/cyclical to me. The TM is great at quickly learning sequential patterns as they appear, but if the signal is just randomly bobbing around that’s just noise from a sequence learning view.
When the anomaly likelihoods do jump on the first two seizures, it appears to be because the anomaly scores have dropped dramatically at those times – but if the normal (non-seizure) patterns are learned and seizures are anomalies, then those scores should be jumping up at the seizures not dropping down.
I wonder if some kind of signal smoothing would help uncover the sequential patterns. If the difference between non-seizure and seizure times is just volatility/variance and not sequential, I think HTM isn't the best option. But I think it's worth trying some smoothing or preprocessing before reaching that conclusion.
To diagnose the presence or absence of sequential patterns, I'd plot the accumulated anomaly score instead of the raw anomaly scores. The raw anomaly scores are volatile and thus hard to interpret visually, but the accumulated anomaly score only increases; the question is at what rate.
When the accumulated score is increasing at a high rate (closer to a 45 degree angle), that means most anomaly scores are high and the sequential predictability is low. When it is increasing at a low rate (closer to flat-lining), it means that predictability has emerged. The slope will always start out at 45 degrees, since the model hasn't learned anything yet, but it may level off. Looking at the slope of the accumulated anomaly scores helps me get a clearer picture of the predictability, but it'll need its own plot separate from the likelihood, since it's not bounded within 0 to 1 as the raw scores are.
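For example, a plain-Python sketch of that diagnostic (anomaly_scores being whatever per-step scores your model emits; the window size is just an illustrative choice):

```python
# Accumulated anomaly score: monotonically increasing, so its slope is
# easy to read. A slope near 1.0 means mostly-unpredicted input; a slope
# that levels off means sequential structure has been learned.

def accumulate(anomaly_scores):
    total, out = 0.0, []
    for s in anomaly_scores:
        total += s
        out.append(total)
    return out

def recent_slope(accumulated, window=100):
    """Average per-step increase over the last `window` steps."""
    w = min(window, len(accumulated) - 1)
    return (accumulated[-1] - accumulated[-1 - w]) / w

scores = [1.0] * 50 + [0.1] * 50        # model starts naive, then predicts well
acc = accumulate(scores)
print(round(recent_slope(acc, 30), 2))  # 0.1: predictability has emerged
```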
If that were the case, shouldn't the anomaly be signaled continuously? An anomaly, I think, is a significant difference between the predicted next value and the actual value.
I think I explained myself badly. I have actually trained on the bits I do not want to flag (the interictal data chunk), then turned off learning and introduced the signal with seizures. What I do not understand is why the model is not detecting the seizures as anomalous if, in theory, it has never seen them.
Is it possible that the model parameters must be retuned when learning is turned off? What sense would that make?
I have tried the same procedure with another dataset; when disabling learning and introducing a dataset with a seizure, the anomaly scores seem to increase at the seizure but the anomaly likelihood does not:
I don’t think so. First of all, there is no gradient descent in HTM. The rough equivalent of an ANN’s weight update in HTM is the update of permanence values, and this happens in real time, i.e. every time it ingests an input, assuming HTM learning is on. SGD is similar to HTM in the sense that both update their parameters (permanences in HTM, weights in an ANN) for every input ingested.
Yes, it is constantly learning and adjusting, assuming HTM learning is on. Anomalies and non-anomalies mean nothing to the TP; they are interpretations of the TP’s output. AFAIK the anomaly score is calculated outside the TP computation and it’s a probabilistic score.
Yep, more precisely a high anomaly score is simply a result of differences between predicted input and actual input. The two SDRs have low overlap.
Once TP learns how to predict an event that in the past was anomalous (== unpredicted) then it is no longer an anomaly.
Temporal memory has no concept of, nor way to, distinguish seizures from other (anomalous or not) data. Its sole purpose is to predict the next input.
If you want to anticipate seizures, you’re better off training an online SDR classifier over the temporal pooler’s (aka temporal memory’s) activation states, telling it which states preceded a seizure and which did not.
Yes, it requires some supervision, but it can still be done online. And allegedly the TM is significantly more sample efficient than a NN-based predictor.
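As a toy illustration of the supervised idea (this overlap-count classifier is a made-up stand-in, far simpler than htm.core's actual classifier; the labels and cell indices are invented for the sketch):

```python
# Label TM activation states by whether a seizure followed, then score
# new states by their overlap with each class's accumulated cell counts.

from collections import defaultdict

class OverlapClassifier:
    def __init__(self):
        # label -> cell index -> how often that cell was active for the label
        self.counts = defaultdict(lambda: defaultdict(int))

    def learn(self, active_cells, label):
        for cell in active_cells:
            self.counts[label][cell] += 1

    def infer(self, active_cells):
        scores = {label: sum(cells[c] for c in active_cells)
                  for label, cells in self.counts.items()}
        return max(scores, key=scores.get)

clf = OverlapClassifier()
clf.learn({1, 2, 3}, "pre-seizure")   # states seen shortly before seizures
clf.learn({7, 8, 9}, "normal")        # states seen during interictal periods
print(clf.infer({2, 3, 4}))           # "pre-seizure": overlaps that class most
```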
If it’s pure noise yes, the anomaly score would be 1.0 at every time step.
That’s not the case here, though it also doesn’t look like the anomaly scores are settling down to be consistently lower than at the beginning - which would happen if clear patterns had emerged.
I do think this is easier to see visually with the accumulated anomaly score than raw anomaly scores, since the accumulated are monotonically increasing and we can just observe the slope of increase.
But if you’re using the anomaly likelihood that would stay low in the case of constant 1.0 anomaly scores - since the likelihood measures recent change in anomaly scores.
The anomaly scores do increase after time step 20000, yes, but it seems not enough to differentiate them significantly from the prior anomaly scores. This difference is what drives up the anomaly likelihood.
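To illustrate, here's a simplified sketch of the likelihood idea, comparing the recent mean score to the longer-run distribution. This is not htm.core's exact AnomalyLikelihood implementation, and the window sizes are just illustrative assumptions:

```python
# The likelihood only rises when recent anomaly scores differ from what
# came before: constant scores, even constant 1.0, produce no signal.

import math

def likelihood(scores, short=10, long=100):
    history = scores[-long:]
    mean = sum(history) / len(history)
    var = sum((s - mean) ** 2 for s in history) / len(history)
    std = math.sqrt(var) or 1e-9        # avoid division by zero
    recent = sum(scores[-short:]) / short
    z = (recent - mean) / std
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))  # Gaussian CDF of z

flat_high = [1.0] * 100            # constant 1.0 scores: no recent *change*
jump = [0.2] * 90 + [0.9] * 10     # scores jump near the end
print(likelihood(flat_high))       # 0.5: neutral despite high raw scores
print(likelihood(jump) > 0.99)     # True: the recent jump stands out
```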
The root of the problem IMO is that the anomaly scores aren’t settling down to consistently low values during the interictal (non-seizure) times. The model won’t have a robust sense of what’s anomalous if it doesn’t have a robust sense of what’s normal.
To try and address this, I’d first try reducing the sampling rate (or aggregating the data down; something to have fewer data points per second). It could be that the sequential patterns exist but occur on a much slower time scale than the sampling rate. For instance, say the patterns repeat every 3-5 seconds but you’re sampling at 100 points per second: it’ll be very hard for the model to find these slow patterns at that fast sampling rate; you’ll lose the forest for the trees.
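For example, a simple way to aggregate the signal down before feeding the model (the factor and the choice of mean aggregation are illustrative assumptions; any sensible window statistic would do):

```python
# Aggregate a fast signal down so slow patterns (on the scale of seconds)
# aren't buried under a 100 Hz sampling rate.

def downsample(signal, factor):
    """Average non-overlapping windows of `factor` samples."""
    n = len(signal) // factor
    return [sum(signal[i * factor:(i + 1) * factor]) / factor
            for i in range(n)]

raw = list(range(100))          # pretend this is 1 second at 100 Hz
slow = downsample(raw, 25)      # 4 points per second instead of 100
print(slow)                     # [12.0, 37.0, 62.0, 87.0]
```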
It could also just be the case that there’s not much cyclical nature to the data and temporal learning isn’t the best way to detect seizures, but I’d try downsampling and smoothing before assuming so.
I don’t know exactly how you are generating these datasets, but my best guess from inspection is that your interictal data is not that different from the seizures. Look at their amplitudes: they have common values; am I the only one seeing it? If I’m right about this, then the TP is doing what it’s supposed to do: it predicted the seizures because it already saw similar data in your training set. Therefore they weren’t interpreted as anomalies.
It might be useful to check your interictal and seizure data to ensure you are generating realistic variations for a seizure.