Hi, I’m new to the forum. I’m interested in detecting anomalies in depth-series data (depth-stamped instead of time-stamped). It seems like it should be possible, but I’m not sure where to begin or whether it has been attempted before. I’m hoping someone can point me in the right direction.
I am confused by what you might mean by depth stamped.
A sequence is a sequence; time just happens to be one axis label in a series.
Thanks for your reply. I meant that instead of having value vs time, I have value vs depth. I was able to run the hotgym example (temperature vs time). How can I do temperature vs depth instead?
Conceptually it should work, I think (because as you mentioned - a sequence is a sequence). I just need to figure out where to implement it in code. I hope I’m making sense.
Call the depth time and feed it right in.
There will still need to be some temporal pattern to the data for HTM to apply. It operates on a stream of data, predicting what will happen next and/or identifying anomalies in the stream.
To use HTM out of the box for this problem, you will probably want to input the data in some particular order (for example starting from the measurement with the smallest depth and progressing to the one with the largest depth for each location).
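A minimal sketch of that reordering, assuming the data is a list of (depth, value) pairs for one location. The helper name `depth_to_pseudo_time`, the origin date, and the `seconds_per_depth_unit` scale are all made up for illustration and are not part of any HTM library. It sorts by depth and maps each depth to a synthetic timestamp, so a pipeline that expects timestamped rows can consume the result.

```python
from datetime import datetime, timedelta


def depth_to_pseudo_time(rows, origin=datetime(2000, 1, 1),
                         seconds_per_depth_unit=3600.0):
    """Sort (depth, value) pairs by depth and attach synthetic timestamps.

    The timestamp is proportional to depth, so irregular depth spacing
    is preserved. `origin` and `seconds_per_depth_unit` are arbitrary
    choices (assumptions), not anything HTM requires.
    """
    ordered = sorted(rows, key=lambda row: row[0])
    return [
        (origin + timedelta(seconds=depth * seconds_per_depth_unit), value)
        for depth, value in ordered
    ]


if __name__ == "__main__":
    # Hypothetical temperature readings at unordered depths.
    readings = [(2.0, 11.5), (0.5, 10.0), (1.0, 10.7)]
    for stamp, value in depth_to_pseudo_time(readings):
        print(stamp.strftime("%Y-%m-%d %H:%M:%S"), value)
```

If the model also derives time-of-day or weekend features from the timestamp, those would carry no meaning for depth data and are probably best left out of the encoding.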
If you aren’t squeamish about experimenting with the core algorithms, you could try to replicate some of the TBT experiments. For example, you could create a population of cells (grid cell modules, etc.) that encodes the depth, and modify the cells in the TM layer to grow distal connections to that other population rather than to other cells in the same TM layer. Given that this is in the “Getting Started” section, that is probably not a great place to start until you are much more familiar with HTM.
Would you show a snippet of this temp vs depth data you have? The key isn’t the timestamp specifically, rather the presence of sequential patterns.
Thank you all for your replies!
@Paul_Lamb My data is ordered sequentially (in increasing depth). I’m still trying to get familiarized with HTM. I’ll keep a note of the TBT experiments you mentioned.
@sheiser1 Here are some examples of the data I have. I believe all of these have the sequential patterns you mentioned? Eventually, I’m interested in detecting anomalies in multiple variables (I’m following the discussion here)
The most promising-looking one to me is 9th from the bottom (y-axis range 0 to 140) – given its clear periodicity and relative stationarity.
There also seems to be a correlation between the 6th & 7th from the top (ranges -600 to 200 and 0 to 800), so I imagine they’d show anomalies at the same times.
A lot of the others are hard to judge by eye, since the spikes stretch the y-axis and most of the movement sits within a small part of the total range.
Do you have labels for when the real anomalies happened, in order to evaluate HTM-raised anomalies? And does each metric have separate anomaly times or are they all part of a larger system?
They are all part of a larger system.
I don’t have labeled data yet, but I can probably generate some synthetic ones. The anomalies I’m interested in include the following:
- global anomalies (values beyond the normal expected range of the datapoints)
- local anomalies (values that deviate significantly from neighboring datapoints)
- group anomalies (values from multiple variables that, as a whole, deviate from the normal)
- trend anomalies (values that deviate from the expected trend - for example, if variables A & B have historically shown a positive correlation, a point where A is increasing while B is decreasing would be considered an anomaly)
Would HTM be able to detect such anomalies? If so, would there be certain criteria that the dataset must satisfy for the algorithm to detect the anomalies successfully?
Generally yes, but it depends on the noise levels and pattern complexity in each variable. HTM’s anomalies are about predictability and changes in it, so it may not notice, say, an inverse relationship between 2 variables being replaced with a direct relationship. It depends on how they were behaving sequentially up until that point, so it could go either way.
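For the trend-anomaly case (two variables that normally move together and then stop doing so), a plain rolling correlation check is one possible complement. This is only a sketch in ordinary Python and not an HTM feature. The function names, the window size, and the zero threshold are assumptions to tune against real data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_breaks(a, b, window=4, expected_sign=1, threshold=0.0):
    """Return the last index of each window whose correlation has the wrong sign.

    expected_sign is +1 if a and b have historically moved together,
    -1 if they have historically moved in opposite directions.
    """
    flagged = []
    for end in range(window, len(a) + 1):
        r = pearson(a[end - window:end], b[end - window:end])
        if expected_sign * r < threshold:
            flagged.append(end - 1)
    return flagged


if __name__ == "__main__":
    # Synthetic example: b rises with a, then starts falling while a keeps rising.
    a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    b = [2, 4, 6, 8, 10, 9, 8, 7, 6, 5]
    print(correlation_breaks(a, b))
```

A short window reacts quickly but is noisy, while a long window is steadier but flags the break later.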
The kinds of anomalies HTM is great at picking up are temporal: changes in the sequential structure of the data. So single big spikes, for instance, are likely better caught by simple thresholding. Since HTM relies on periodicity in the data, a trending series without periodicity won’t be well modeled. It makes sense to me to have other detectors in place for those kinds of anomalies.
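To make the "simple thresholding" idea concrete, here is a rough sketch of two such side detectors: a fixed-range check for global anomalies and a rolling z-score for local ones. The function names, window size, z-score cutoff, and `min_sigma` floor are illustrative assumptions, not recommended settings.

```python
from statistics import mean, stdev


def global_anomalies(values, low, high):
    """Indices of values outside the expected [low, high] range."""
    return [i for i, v in enumerate(values) if v < low or v > high]


def local_anomalies(values, window=5, z_threshold=3.0, min_sigma=1e-9):
    """Indices of values far from the mean of the preceding `window` points.

    The deviation is measured in standard deviations of that preceding
    window; min_sigma avoids division by zero on perfectly flat windows.
    """
    flagged = []
    for i in range(window, len(values)):
        previous = values[i - window:i]
        mu = mean(previous)
        sigma = max(stdev(previous), min_sigma)
        if abs(values[i] - mu) / sigma > z_threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Synthetic series with a single spike at index 6.
    series = [10, 10.1, 9.9, 10, 10.1, 9.9, 50, 10, 10.1]
    print(global_anomalies(series, low=0, high=40))
    print(local_anomalies(series))
```

Note that a spike inflates the standard deviation of the windows that follow it, so a second anomaly shortly after the first may be missed with this approach.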