Anomaly Detection - Poor results - Build issues or Tuning issues on Real Data

dk25094 · June 7, 2020, 7:13pm

Dear HTM Anomaly Detection community,
I was wondering if anyone can shed some light on what I am doing wrong. I have been pulling my hair out over this for the last week.

I am getting poor results and am not sure whether that is due to the tuning parameters or the library versions.

BACKGROUND:

PROBLEM 1: GitHub build version issues:
I started with htmengine, which depends on an old version of nupic, 0.5.7. I was not able to install the preferred version, nupic 1.0.7.
So I have nta.utils 0.0.0 and htmengine 0.0.0.
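
For reference, the installed versions can be confirmed with a short check like the one below. This is only a sketch: the helper name `installed_version` is made up for illustration, and the package names are simply the ones mentioned above (`nupic.bindings` is included on the assumption that it is installed alongside nupic).

```python
def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        # Python 3.8+ standard library
        from importlib.metadata import version, PackageNotFoundError
    except ImportError:
        # Fallback for Python 2.7 environments (setuptools)
        import pkg_resources
        try:
            return pkg_resources.get_distribution(package_name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Package names taken from the post; adjust to whatever your environment uses.
for name in ("nupic", "nupic.bindings", "htmengine", "nta.utils"):
    print("%s: %s" % (name, installed_version(name) or "not installed"))
```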

PROBLEM 2: results don't make sense.

I saw Matt's video on the hot gym, and the results there totally make sense, but on my data the results don't make sense. I was wondering whether that is due to the libraries I used to build the tool or to the tunable params below.

Here is a screenshot of where the results don't make sense:

The predicted values are very close to the actual values, yet the anomaly likelihood goes high. This contradicts Matt's hot gym video, where the anomaly likelihood (and score) goes high when the model's prediction and the actual value differ by a large amount.

Any clarification on what is going on would be highly appreciated. Is it the build itself, my tunable parameters, or something else that I am missing?

thanks

My parameters are:

{
    'aggregationInfo': {
        'days': 0,
        'fields': [],
        'hours': 0,
        'microseconds': 0,
        'milliseconds': 0,
        'minutes': 0,
        'months': 0,
        'seconds': 0,
        'weeks': 0,
        'years': 0,
    },
    'model': 'CLA',
    'modelParams': {
        'anomalyParams': {
            u'anomalyCacheRecords': None,
            u'autoDetectThreshold': None,
            u'autoDetectWaitRecords': None,
        },
        'clParams': {
            'alpha': 0.014695645742164247,
            'regionName': 'SDRClassifierRegion',
            'steps': '1',
            'verbosity': 0,
        },
        'inferenceType': 'TemporalAnomaly',
        'sensorParams': {
            'encoders': {
                u'network_latency': {
                    'clipInput': True,
                    'fieldname': 'network_latency',
                    'maxval': 53.0,
                    'minval': 0.0,
                    'n': 102,
                    'name': 'network_latency',
                    'type': 'ScalarEncoder',
                    'w': 21,
                },
                u'timestamp_dayOfWeek': None,
                u'timestamp_timeOfDay': {
                    'fieldname': 'timestamp',
                    'name': 'timestamp',
                    'timeOfDay': (21, 1.3311854462097823),
                    'type': 'DateEncoder',
                },
                u'timestamp_weekend': None,
            },
            'sensorAutoReset': None,
            'verbosity': 0,
        },
        'spEnable': True,
        'spParams': {
            'columnCount': 2048,
            'globalInhibition': 1,
            'inputWidth': 0,
            'maxBoost': 1.0,
            'numActiveColumnsPerInhArea': 40,
            'potentialPct': 0.8,
            'seed': 1956,
            'spVerbosity': 0,
            'spatialImp': 'cpp',
            'synPermActiveInc': 0.05,
            'synPermConnected': 0.1,
            'synPermInactiveDec': 0.1,
        },
        'tpEnable': True,
        'tpParams': {
            'activationThreshold': 13,
            'cellsPerColumn': 32,
            'columnCount': 2048,
            'globalDecay': 0.0,
            'initialPerm': 0.21,
            'inputWidth': 2048,
            'maxAge': 0,
            'maxSegmentsPerCell': 128,
            'maxSynapsesPerSegment': 32,
            'minThreshold': 9,
            'newSynapseCount': 20,
            'outputType': 'normal',
            'pamLength': 2,
            'permanenceDec': 0.1,
            'permanenceInc': 0.1,
            'seed': 1960,
            'temporalImp': 'cpp',
            'verbosity': 0,
        },
        'trainSPNetOnlyIfRequested': False,
    },
    'predictAheadTime': None,
    'version': 1,
}

sheiser1 · June 7, 2020, 8:00pm

Hi @dk25094, welcome!

This is what jumped out at me.
All recent versions I’ve seen use ‘HTMPrediction’ instead of ‘CLA’.
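
As a rough sketch of that change, the `model` entry can be swapped on a copy of the params dict before it is handed to the model factory. The helper name `use_htm_prediction_model` is made up for illustration, and whether `'HTMPrediction'` is accepted depends on the installed NuPIC release: an older one such as the 0.5.7 mentioned above may only recognize `'CLA'`, which has not been verified here.

```python
import copy


def use_htm_prediction_model(params):
    """Return a copy of an OPF params dict with the legacy 'CLA' model name replaced."""
    updated = copy.deepcopy(params)
    if updated.get("model") == "CLA":
        updated["model"] = "HTMPrediction"
    return updated


# Trimmed-down stand-in for the full params dict posted above.
old_params = {
    "model": "CLA",
    "version": 1,
    "modelParams": {"inferenceType": "TemporalAnomaly"},
}
new_params = use_htm_prediction_model(old_params)

print(new_params["model"])  # HTMPrediction
print(old_params["model"])  # CLA (the original dict is left untouched)
```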

In case you haven’t already, I’d have a look at the model_params file for the hotgym anomaly example:

https://github.com/numenta/nupic-legacy/blob/master/examples/opf/clients/hotgym/anomaly/model_params.py

The Classifier is not involved here so it won’t yield predicted values, just anomaly score and likelihoods. You can still use a Classifier and it’ll have no effect on the anomaly scores, but I wouldn’t assume that the system is predicting well just because the Classifier-‘predicted’ and ‘actual’ values are close. I’m not totally familiar with the Classifier, but I feel I’ve seen this before. It may be that when the TM generates no predictions at all the Classifier defaults to the last observed value, which could make them look close often.
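
To illustrate why the Classifier's output does not enter into it: the raw anomaly score is usually described as the fraction of currently active columns that the temporal memory did not predict on the previous step. The snippet below is a minimal sketch of that definition, not NuPIC's own code, and the column indices are made up.

```python
def raw_anomaly_score(active_columns, predicted_columns):
    """Fraction of active columns that were NOT predicted on the previous step."""
    active = set(active_columns)
    if not active:
        # No active columns: nothing could have been missed.
        return 0.0
    predicted = set(predicted_columns)
    return 1.0 - len(active & predicted) / float(len(active))


# Half of the active columns were predicted.
print(raw_anomaly_score([1, 2, 3, 4], [3, 4, 5]))   # 0.5
# The temporal memory predicted nothing at all.
print(raw_anomaly_score([1, 2, 3, 4], []))          # 1.0
# Every active column was predicted.
print(raw_anomaly_score([1, 2, 3, 4], [1, 2, 3, 4, 9]))  # 0.0
```

Note that nothing in this calculation looks at the decoded 'predicted' value, so a close-looking prediction curve and a high anomaly score can coexist.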

I’m not sure of the implications of your version issue, but my instinct is to try your setup on some well-understood data first. This could be the hotgym data itself, since we know what the output will look like if the system is acting normally.
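
A related sanity check on your own output is to see whether the 'predicted' values are mostly just the previous observation, as suspected above. The sketch below compares the model's error with a naive "repeat the last value" forecast. The function name and the sample numbers are made up, and it assumes `predicted[t]` has already been aligned so that it is the value predicted for step `t`.

```python
def persistence_report(actual, predicted, tol=1e-9):
    """Compare predictions with a naive 'repeat the last value' forecast.

    actual[t] is the observed value at step t; predicted[t] is the value
    that was predicted for step t (alignment is the caller's job).
    """
    n = min(len(actual), len(predicted))
    if n < 2:
        raise ValueError("need at least two aligned samples")
    steps = range(1, n)
    count = float(n - 1)
    # Mean absolute error of the model's predictions.
    model_mae = sum(abs(predicted[t] - actual[t]) for t in steps) / count
    # Mean absolute error of simply repeating the previous observation.
    naive_mae = sum(abs(actual[t - 1] - actual[t]) for t in steps) / count
    # How often the prediction equals the previous observation.
    echoed = sum(1 for t in steps if abs(predicted[t] - actual[t - 1]) <= tol)
    return {
        "model_mae": model_mae,
        "naive_mae": naive_mae,
        "echo_fraction": echoed / count,
    }


# Made-up series in which every prediction just echoes the previous value.
actual = [10.0, 12.0, 11.0, 15.0, 14.0]
predicted = [0.0, 10.0, 12.0, 11.0, 15.0]
report = persistence_report(actual, predicted)
print(report)
```

If `echo_fraction` is close to 1 and `model_mae` is about the same as `naive_mae`, the close-looking curves say little about how well the model has learned the sequence.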
