Hello everyone,
I am a newbie trying to use the HTM model to detect steps in my data. The data come from an experiment, and I would like the model to detect steps within the noise. For the past month I have been trying to adapt the HotGym anomaly detection example to my data, mostly by adjusting the encoder parameters (mainly ‘n’).
To try the model, I first fed it a trace with very marked, clear steps. I attach an image:
Note: The red graph shows the position of my probe (y axis) over time. The recording spans 0 to 600 seconds, but the x axis shows the sample index (the number of data points), not the time.
The blue graph shows the anomaly scores. Sometimes it seems to point out the steps correctly, but sometimes it doesn’t… I have used the same model parameters as in the HotGym example, replacing the date encoder with a scalar encoder for the time (because my data are sampled at a frequency of 120 Hz for 600 seconds). These are the parameters with the changes:
> MODEL_PARAMS = \
> { 'aggregationInfo': { 'days': 0,
> 'fields': [],
> 'hours': 0,
> 'microseconds': 0,
> 'milliseconds': 0,
> 'minutes': 0,
> 'months': 0,
> 'seconds': 0,
> 'weeks': 0,
> 'years': 0},
> 'model': 'HTMPrediction',
> 'modelParams': { 'anomalyParams': { u'anomalyCacheRecords': None,
> u'autoDetectThreshold': None,
> u'autoDetectWaitRecords': None},
> 'clParams': { 'alpha': 0.01962508905154251,
> 'verbosity': 0,
> 'regionName': 'SDRClassifierRegion',
> 'steps': '1'},
> 'inferenceType': 'TemporalAnomaly',
> 'sensorParams': { 'encoders': { '_classifierInput': { 'classifierOnly': True,
> 'clipInput': True,
> 'fieldname': 'position',
> 'maxval': 7700,
> 'minval': 6600,
> 'n': 115,
> 'name': '_classifierInput',
> 'type': 'ScalarEncoder',
> 'w': 21},
> u'position': { 'clipInput': True,
> 'fieldname': 'position',
> 'maxval': 7700,
> 'minval': 6600,
> 'n': 500,
> 'name': 'position',
> 'type': 'ScalarEncoder',
> 'w': 21},
> u'time':{'clipInput': True,
> 'fieldname': 'time',
> 'maxval': 600,
> 'minval': 0.0,
> 'n' : 9000,
> 'name' : 'time',
> 'type' : 'ScalarEncoder',
> 'w': 21}},
> 'sensorAutoReset': None,
> 'verbosity': 0},
> 'spEnable': True,
> 'spParams': { 'columnCount': 2048,
> 'globalInhibition': 1,
> 'inputWidth': 0,
> 'boostStrength': 2.0,
> 'numActiveColumnsPerInhArea': 40,
> 'potentialPct': 0.8,
> 'seed': 1956,
> 'spVerbosity': 0,
> 'spatialImp': 'cpp',
> 'synPermActiveInc': 0.05,
> 'synPermConnected': 0.1,
> 'synPermInactiveDec': 0.08568228006654939},
> 'tmEnable': True,
> 'tmParams': { 'activationThreshold': 12,
> 'cellsPerColumn': 32,
> 'columnCount': 2048,
> 'globalDecay': 0.0,
> 'initialPerm': 0.21,
> 'inputWidth': 2048,
> 'maxAge': 0,
> 'maxSegmentsPerCell': 128,
> 'maxSynapsesPerSegment': 32,
> 'minThreshold': 10,
> 'newSynapseCount': 20,
> 'outputType': 'normal',
> 'pamLength': 1,
> 'permanenceDec': 0.1,
> 'permanenceInc': 0.1,
> 'seed': 1960,
> 'temporalImp': 'cpp',
> 'verbosity': 0},
> 'trainSPNetOnlyIfRequested': False},
> 'predictAheadTime': None,
> 'version': 1}
I do not know whether this type of encoding is a problem for ‘time’ (all the examples I have seen use date encoders). The values for ‘n’ are the best I could find. I also have doubts about the ‘_classifierInput’ encoder with ‘classifierOnly’: True. Why is it necessary to include this first encoder? I have read that it feeds the SDR Classifier. In that case, what is the effect of its ‘n’ being different from that of the ‘position’ encoder? I could just remove the SDR Classifier and this ‘_classifierInput’ encoder and the results would be the same, am I wrong?
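To make the ‘n’ question more concrete, here is a rough sketch of what the encoder settings above imply. It assumes that a non-periodic ScalarEncoder with `clipInput` uses a bucket width of (maxval - minval) / (n - w), and that the `classifierOnly` encoder does not feed the spatial pooler; both are assumptions on my part, and the helper name `scalar_encoder_resolution` is made up for illustration.

```python
def scalar_encoder_resolution(minval, maxval, n, w):
    """Bucket width of a non-periodic scalar encoder.

    Assumed formula: (maxval - minval) / (n - w).
    """
    return float(maxval - minval) / (n - w)


SAMPLE_RATE_HZ = 120.0  # sampling frequency stated in the post

# Values taken from the MODEL_PARAMS above.
position_res = scalar_encoder_resolution(6600, 7700, n=500, w=21)
time_res = scalar_encoder_resolution(0.0, 600, n=9000, w=21)

# How many consecutive samples land in the same time bucket.
samples_per_time_bucket = time_res * SAMPLE_RATE_HZ

# Share of the encoder output bits that belong to the time field,
# assuming only 'position' and 'time' reach the spatial pooler.
time_share_of_bits = 9000.0 / (9000 + 500)

# Share of the active (1) bits that belong to the time field: both use w = 21.
time_share_of_active_bits = 21.0 / (21 + 21)

print("position bucket width: %.3f position units" % position_res)
print("time bucket width: %.4f s" % time_res)
print("samples per time bucket: %.2f" % samples_per_time_bucket)
print("time share of input bits: %.3f" % time_share_of_bits)
print("time share of active bits: %.2f" % time_share_of_active_bits)
```

Under these assumptions, a position change smaller than about 2.3 units falls into the same bucket, roughly 8 consecutive samples share one time bucket, and the time field contributes half of the active input bits even though it never repeats within a trace.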
Then, I tried the model with some real data… and this was the result:
As you can see, it fails to detect an abrupt step (shown in the zoom). Strangely, the peak in the anomaly score is in a place where no anomaly is occurring at all. It also fails to detect the steps at the end of the trace. The parameters are the same as above except for the encoder values: n = 9000 for both the time and position encoders (I have tried other values of ‘n’, but they give a very high anomaly score almost everywhere in the trace). I also set the min and max values to match the range of this data.
I am also trying to run a swarm, but I am having problems with the MySQL server. I have seen posts on this topic, so I will keep trying. My main concern is whether this type of model simply does not work with this type of data or this type of problem (step detection), or whether I would get very good results if I tuned the parameters correctly. What do you think?
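So far I have judged these results by eye. A minimal sketch of how the comparison could be made quantitative is below: it checks whether each known step has a high anomaly score nearby, and lists high scores that are far from any step. The function names, the threshold of 0.9, the tolerance of 5 samples, and the toy scores are all hypothetical and not taken from my real data.

```python
def detected_steps(anomaly_scores, step_indices, threshold=0.9, tolerance=5):
    """Map each known step index to True if any anomaly score within
    `tolerance` samples of it reaches `threshold`."""
    results = {}
    for step in step_indices:
        lo = max(0, step - tolerance)
        hi = min(len(anomaly_scores), step + tolerance + 1)
        results[step] = any(s >= threshold for s in anomaly_scores[lo:hi])
    return results


def false_alarms(anomaly_scores, step_indices, threshold=0.9, tolerance=5):
    """Indices whose score reaches `threshold` but that lie more than
    `tolerance` samples away from every known step."""
    return [
        i for i, score in enumerate(anomaly_scores)
        if score >= threshold
        and all(abs(i - step) > tolerance for step in step_indices)
    ]


# Toy example: 20 samples, true steps at indices 10 and 17.
scores = [0.0] * 20
scores[10] = 1.0   # peak on a real step
scores[3] = 0.95   # peak where nothing happens
steps = [10, 17]

print(detected_steps(scores, steps))
print(false_alarms(scores, steps))
```

In the toy example the step at index 10 counts as detected, the step at index 17 is missed, and index 3 is reported as a false alarm, which mirrors the kind of behaviour described above.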
In conclusion:
- Which other parameters might be key to change in order to get step detection?
- Is there something important about the model that I am missing?
- Does anomaly detection work on this type of data?
Thank you very much!