Does anomaly detection work for detecting steps in noisy data? My results do not make sense

Hello everyone,

I am a newbie trying to use the HTM model to detect steps in my data. The data come from an experiment, and I would like the model to detect steps within the noise. To do so, I have been trying for a month to use the HotGym anomaly detection example with my data. I have been playing with the parameters, mainly the encoder ones (‘n’, basically).

To try the model, I first fed it a trace with very clear, well-marked steps. I attach an image:

Note: In the red graph, the y axis represents the position of my probe over time. The time runs from 0 to 600 seconds. The x axis represents the number of points in my data (not the time).
The blue graph shows the anomaly scores. Sometimes it seems to point out the steps correctly, but sometimes it doesn’t… I have used the same model parameters as in the HotGym example, replacing the date encoder with a scalar encoder for the time (because my data is taken at a frequency of 120 Hz, over 600 seconds). These are the parameters with the changes:

> { 'aggregationInfo': { 'days': 0,
>                        'fields': [],
>                        'hours': 0,
>                        'microseconds': 0,
>                        'milliseconds': 0,
>                        'minutes': 0,
>                        'months': 0,
>                        'seconds': 0,
>                        'weeks': 0,
>                        'years': 0},
>   'model': 'HTMPrediction',
>   'modelParams': { 'anomalyParams': { u'anomalyCacheRecords': None,
>                                       u'autoDetectThreshold': None,
>                                       u'autoDetectWaitRecords': None},
>                    'clParams': { 'alpha': 0.01962508905154251,
>                                  'verbosity': 0,
>                                  'regionName': 'SDRClassifierRegion',
>                                  'steps': '1'},
>                    'inferenceType': 'TemporalAnomaly',
>                    'sensorParams': { 'encoders': { '_classifierInput': { 'classifierOnly': True,
>                                                                          'clipInput': True,
>                                                                          'fieldname': 'position',
>                                                                          'maxval': 7700,
>                                                                          'minval': 6600,
>                                                                          'n': 115,
>                                                                          'name': '_classifierInput',
>                                                                          'type': 'ScalarEncoder',
>                                                                          'w': 21},
>                                                    u'position': { 'clipInput': True,
>                                                                   'fieldname': 'position',
>                                                                   'maxval': 7700,
>                                                                   'minval': 6600,
>                                                                   'n': 500,
>                                                                   'name': 'position',
>                                                                   'type': 'ScalarEncoder',
>                                                                   'w': 21},
>                                                    u'time': { 'clipInput': True,
>                                                               'fieldname': 'time',
>                                                               'maxval': 600,
>                                                               'minval': 0.0,
>                                                               'n': 9000,
>                                                               'name': 'time',
>                                                               'type': 'ScalarEncoder',
>                                                               'w': 21}},
>                                     'sensorAutoReset': None,
>                                     'verbosity': 0},
>                    'spEnable': True,
>                    'spParams': { 'columnCount': 2048,
>                                  'globalInhibition': 1,
>                                  'inputWidth': 0,
>                                  'boostStrength': 2.0,
>                                  'numActiveColumnsPerInhArea': 40,
>                                  'potentialPct': 0.8,
>                                  'seed': 1956,
>                                  'spVerbosity': 0,
>                                  'spatialImp': 'cpp',
>                                  'synPermActiveInc': 0.05,
>                                  'synPermConnected': 0.1,
>                                  'synPermInactiveDec': 0.08568228006654939},
>                    'tmEnable': True,
>                    'tmParams': { 'activationThreshold': 12,
>                                  'cellsPerColumn': 32,
>                                  'columnCount': 2048,
>                                  'globalDecay': 0.0,
>                                  'initialPerm': 0.21,
>                                  'inputWidth': 2048,
>                                  'maxAge': 0,
>                                  'maxSegmentsPerCell': 128,
>                                  'maxSynapsesPerSegment': 32,
>                                  'minThreshold': 10,
>                                  'newSynapseCount': 20,
>                                  'outputType': 'normal',
>                                  'pamLength': 1,
>                                  'permanenceDec': 0.1,
>                                  'permanenceInc': 0.1,
>                                  'seed': 1960,
>                                  'temporalImp': 'cpp',
>                                  'verbosity': 0},
>                    'trainSPNetOnlyIfRequested': False},
>   'predictAheadTime': None,
>   'version': 1}

I do not know if there is a problem with this type of encoding for the ‘time’… (all the examples I have seen use date encoders). The values for ‘n’ are the best I could find. I also have doubts about the ‘_classifierInput’ encoder with ‘classifierOnly’: True. Why is it necessary to include this encoder? I have read that it feeds the SDR Classifier. In that case, does it matter that its ‘n’ value is different from the one in the ‘position’ encoder? Could I just remove the SDR Classifier and this classifierInput encoder and get the same results, or am I wrong?

Then, I tried the model with some real data… and this was the result:

As you can see, it fails to detect an abrupt step (shown in the zoom). It is very strange that the peak in the anomaly score sits where no anomaly is occurring at all. It also fails to detect the steps at the end of the trace. The parameters are the same as above, except that n = 9000 for both the time and position encoders (I have tried other values of ‘n’, but they give me a very high anomaly score almost everywhere in the trace). I also tuned the min and max values to the corresponding ones.

I am also trying to use swarming, but I am having problems with the MySQL server. I have seen posts on this topic, so I will keep trying. My main concern is whether this type of model simply does not work with this type of data or on this type of problem (step detection), or whether I will get very good results if I tune the parameters correctly. What do you think?

In conclusion:

  • What other parameters may be key to change in order to get step-detection?
  • Is there something important about the model that I am missing?
  • Does anomaly detection work on this type of data?

Thank you very much! :blush:


Sorry I can’t give you anything concrete, but I can share my intuition.

The main thing that stands out to me is that you have described the problem as detecting steps within noise. Your observation that you had problems with very high anomaly scores almost everywhere also supports the possibility that the data is in fact noise. If that is the case, then HTM is probably not the best fit for this problem.

HTM is really designed for data which has temporal patterns / periodicity which it can learn to model. When we speak of HTM’s “noise tolerance”, we mean noise that is masking the underlying temporal patterns (i.e. those patterns must still be present). This use case seems to be essentially a smoothing and edge-detection type of problem, rather than a temporal pattern recognition one.

That said, it may be possible to apply HTM to this problem with the right encoding strategy. For example, you might somehow encode a general heading/velocity from the data, so that the “straight” runs in the graph become predictable and the sharp turns stand out. With just a scalar encoder, I think HTM is going to be focused on the non-periodicity of the data in general, and miss the forest for the trees so to speak when it comes to the steps in the data.


I think this type of problem (“step detection”) is not the best fit for HTM, because the change in behavior is spatial – the distribution has basically shifted up/down.
HTM is great at recognizing temporal changes in behavior: times when the sequential traits of the data have changed, even if the distribution seems stationary (no change in mean/variance).

That said I’m not sure HTM couldn’t also work in your case, but those encoder values would definitely need to change.

I’m pretty sure this ‘time’ encoding is invalid. If you’re treating time as a time-step (always incrementing 1, 2, 3, …10000) it’s just adding noise – so I’d drop it completely. It only makes sense to encode things that can potentially show periodicity. The datetime encoder accomplishes this by representing aspects of the calendar that repeat (like hour of day, day of week and time of year).

When playing with the encodings, ‘n’ should always be 10-20x ‘w’. I’d recommend leaving ‘n’ constant within that range and playing with the ‘minval’/‘maxval’ instead.

Also, the min/max values of 6600 and 7700 make sense for your first data set but not the second, since its values are much higher. Because you have clipInput on, the encoder sees every value below 6600 as 6600 and every value above 7700 as 7700. So in your “real data” case the system is getting the exact same ‘position’ encoding vector at each time step – which carries no information.
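To make the clipping and the n/w relationship concrete, here is a minimal pure-Python sketch of a ScalarEncoder-style encoding (a simplification for illustration, not the actual NuPIC implementation):

```python
def encode_scalar(value, minval, maxval, n=400, w=21):
    """Toy scalar encoder: a run of w active bits whose position
    slides linearly with the (clipped) input value."""
    value = max(minval, min(maxval, value))  # clipInput behaviour
    start = int(round((value - minval) / float(maxval - minval) * (n - w)))
    return [1 if start <= i < start + w else 0 for i in range(n)]

def overlap(a, b):
    """Number of active bits two encodings share."""
    return sum(x & y for x, y in zip(a, b))

low = encode_scalar(6600, 6600, 7700)
near = encode_scalar(6650, 6600, 7700)
high = encode_scalar(7700, 6600, 7700)

assert sum(low) == 21                  # always exactly w active bits
assert overlap(low, near) > 0          # nearby values share bits
assert overlap(low, high) == 0         # distant values share none
# anything above maxval is clipped, so 7800 encodes identically to 7700
assert encode_scalar(7800, 6600, 7700) == high
```

With a much larger ‘n’ (e.g. 9000) and the same ‘w’, the active-bit runs for nearby values barely overlap any more, which is one reason very large ‘n’ values tend to behave poorly.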

Lastly, I’d highly recommend calculating the Anomaly Likelihood on top of the raw anomaly score, as it normalizes for noise in the data. That said, the first thing is definitely fixing the encoders.
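For reference, the anomaly-likelihood idea can be sketched in a few lines of plain Python. This is a simplification of what NuPIC’s anomaly-likelihood machinery does, not its actual implementation: fit the recent distribution of raw anomaly scores and report how improbable the latest short-term average is under it.

```python
import math

def anomaly_likelihood(raw_scores, short=10, long=100):
    """Gaussian tail probability of the short-term average of raw
    anomaly scores, relative to the long-term score distribution.
    Assumes len(raw_scores) >= long >= short."""
    history = raw_scores[-long:]
    mean = sum(history) / len(history)
    var = sum((s - mean) ** 2 for s in history) / len(history)
    std = max(math.sqrt(var), 1e-3)  # floor: avoid dividing by ~0 on flat streams
    recent = sum(raw_scores[-short:]) / short
    z = (recent - mean) / std
    # P(score <= recent) under a normal fit of the history
    return 1.0 - 0.5 * math.erfc(z / math.sqrt(2))

# a burst of high raw scores after a quiet stretch -> likelihood near 1
assert anomaly_likelihood([0.1] * 100 + [0.9] * 10) > 0.99
# a flat stream of scores -> likelihood stays around 0.5
assert abs(anomaly_likelihood([0.1] * 100) - 0.5) < 1e-3
```

The point is that a single high raw score in a noisy stream barely moves the likelihood, while a sustained shift in the score distribution does.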


I suspect that an HTM approach could be made to work for this, using spatial pooling only, but that it’s probably overkill… like WAY overkill.

I’d train two separate spatial poolers (no temporal memory / distal connections)… one for “step” data and the other for “non-step” (stride?) data. Let them specialize on those patterns, then freeze the learning. My suspicion is that when “step” data is present, the step pool would show a higher/more active level of firing, and vice versa, as determined by summing the activation values of all the actively firing columns, which would represent each pool’s overall confidence level.

Another approach might be to (again, using only a spatial pooler) feed training data through a couple of epochs, allowing it to learn the patterns; then, on the final pass, when you know whether the input data is step or not-step, save the resulting SDRs into step/not-step buckets. For validation data, disable learning and compare the current sample’s output SDR to each bucket using either an overlap or a Hamming distance score. YMMV.

A further area for exploration might be to see whether OR’ing all the patterns in the step and not-step buckets produces two patterns distinct enough that you could simply use them for quick comparison during classification.
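As a sketch of the bucket-comparison idea, with SDRs represented as Python sets of active column indices (the helper names are made up, and the SDRs themselves would come out of the frozen spatial pooler):

```python
def overlap(sdr_a, sdr_b):
    """Overlap score: number of active bits two SDRs share."""
    return len(sdr_a & sdr_b)

def classify(sdr, step_bucket, stride_bucket):
    """Label a sample by its best-matching SDR in either bucket."""
    step_score = max(overlap(sdr, s) for s in step_bucket)
    stride_score = max(overlap(sdr, s) for s in stride_bucket)
    return "step" if step_score > stride_score else "stride"

step_bucket = [{1, 2, 3, 4}, {2, 3, 4, 5}]      # toy saved "step" SDRs
stride_bucket = [{10, 11, 12, 13}]              # toy saved "stride" SDRs
assert classify({1, 2, 3, 99}, step_bucket, stride_bucket) == "step"
assert classify({10, 11, 12, 99}, step_bucket, stride_bucket) == "stride"
```

The OR’ed-bucket variant would just replace each bucket with a single union set and compare against that.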

Better approaches would be statistical ones, such as Kalman filters or a Gaussian distribution over N timesteps, with some logic built on top to determine the width of one stddev, how far beyond that counts as a “step”, and how many frames beyond that count as part of a single step.
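That statistical route can be quite small in practice. Here is a rough sketch of a rolling z-score step detector along those lines (window, threshold and min_gap are placeholder values you would tune against your sampling rate):

```python
import statistics

def detect_steps(samples, window=50, threshold=3.0, min_gap=10):
    """Flag indices whose value sits more than `threshold` standard
    deviations from the trailing window's mean, merging flags closer
    than `min_gap` frames into a single step."""
    steps = []
    for i in range(window, len(samples)):
        recent = samples[i - window:i]
        mean = sum(recent) / window
        std = statistics.pstdev(recent) or 1e-9  # avoid division by zero
        if abs(samples[i] - mean) / std > threshold:
            if not steps or i - steps[-1] > min_gap:
                steps.append(i)
    return steps

assert detect_steps([0.0] * 100 + [5.0] * 100) == [100]  # one step, at index 100
assert detect_steps([1.0] * 200) == []                   # flat trace: no steps
```

A Kalman filter would give a smoother estimate of the underlying level, but this kind of windowed z-score is often enough for clean step detection.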

Dependent variables would be your sampling rate (samples per second), the overall loop time for a single frame, etc. Again, I think that if you had labelled data (what is a step, what isn’t), you could teach a spatial pooler to recognize a “step”, but I personally wouldn’t advise it.

Interesting experiment though! :slight_smile:

Final edit:

For your encoder input, I’d concatenate timesteps N and N-1 together, to cheaply embody some of the temporal aspects of your data. Maybe even N, N-1, and N-2.
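A sketch of that concatenation, assuming the raw samples sit in a plain list (the resulting tuple would then be encoded as a multi-field input; `concat_window` is a made-up helper name):

```python
def concat_window(samples, i, depth=2):
    """Current sample plus its `depth` predecessors, oldest first,
    repeating the first sample when the history is too short."""
    return tuple(samples[max(0, i - k)] for k in range(depth, -1, -1))

assert concat_window([10, 20, 30, 40], 3) == (20, 30, 40)  # N-2, N-1, N
assert concat_window([10, 20, 30, 40], 0) == (10, 10, 10)  # padded start
```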


Thank you very much for sharing your intuition!

You are right, this problem is not about temporal pattern recognition, so probably HTM is not the best approach.

OK, so maybe with the right encoder the model can perform well. The data is not necessarily ‘straight’ when there is no step; it can also drift up or down, so I do not know how that affects the general performance.


Thank you! Very clarifying.

What happens if I do not include the encoder for the time? I tried including only the encoder for the position and the results did not seem to change very much.

So I should keep it at about 400?

Oh, I used the correct values for the min and max in the second example (a min of 500 and a max of 1700, if I remember well)! I only posted the values for the first one.

OK! I calculated the Anomaly Likelihood, but the results were not better. Too many peaks… So yes, I will focus on changing the encoders. @Paul_Lamb also suggested that the scalar encoder is not the best option.


Whatever encoder you are using, no anomalies will be detected unless there is enough disturbance in the network (specifically, a deviation in the signal trend). Since your data does not seem to have a clear trend, it is hard to identify anomalies.


You may not see an improvement yet; it could still come from optimizing the ‘position’ encoder. But a ‘time’ value that only increments just adds noise – thus obstructing any improvement.

Sure, though I agree the ScalarEncoder is likely not optimal. With the RandomDistributedScalarEncoder you specify a resolution instead of min/max. This means that any two values separated by more than this resolution will have non-overlapping encodings.

Even before switching to the RDSE, I’d try some data preprocessing – like converting the raw values to % change from the prior time step. I think this should help expose whether there are sequential patterns in the data. It’s hard to tell with the current raw values, since the plot’s y-axis has such a wide range. It follows that the encoder had the same wide range, with min/max of 500/1700. This covers all the values, but it lacks precision, since most sequences look to be contained within a much smaller range. I think the % change could help alleviate this. I’d get all the % changes, plot them in a histogram, then pick new min/max values from there and rerun.
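A sketch of that preprocessing in plain Python (`percent_change` and `pick_encoder_range` are made-up helper names, and clipping 1% from each tail stands in for eyeballing the histogram):

```python
def percent_change(samples):
    """Convert raw values to percent change from the prior time step."""
    return [(b - a) / a * 100.0 for a, b in zip(samples, samples[1:])]

def pick_encoder_range(changes, clip=0.01):
    """Pick minval/maxval covering all but the most extreme fraction
    `clip` of values at each tail, instead of the full raw range."""
    s = sorted(changes)
    k = int(len(s) * clip)
    return s[k], s[len(s) - 1 - k]

changes = percent_change([100.0, 110.0, 99.0])
assert abs(changes[0] - 10.0) < 1e-9   # 100 -> 110 is +10%
assert abs(changes[1] + 10.0) < 1e-9   # 110 -> 99 is -10%
assert pick_encoder_range([float(x) for x in range(100)]) == (1.0, 98.0)
```

The min/max chosen this way concentrate the encoder’s precision where the data actually lives, rather than spreading it across rare extremes.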


> Sure, though I agree the ScalarEncoder is likely not optimal. With the RandomDistributedScalarEncoder you specify a resolution instead of min/max. This means that any two values separated by more than this resolution will have non-overlapping encodings.

Two values separated by more than the ‘resolution’ can still have overlapping bits when the encoding is performed using the RDSE. One can control the overlap percentage by playing with the ‘resolution’, ‘n’, and ‘w’ parameters.
