Advanced settings for model parameters in HTM Engine

Hello,

I’m using the HTM Engine to detect anomalies in streaming data based on the load of our systems (for now, only the number of requests per minute on our web sites, which has a clear pattern).

We’re getting good results, but also a lot of “false positives”, mainly on Mondays after a period of light load. I think the HTM Engine can’t “understand” the transition from the weekend to Monday.

I thought maybe I could change some model parameters besides min and max, something like columnCount or cellsPerColumn.

So the question is: how can I do advanced configuration of the model parameters? Is it possible?

I’m not a developer, so this kind of setup would be very helpful for me.

Thanks in advance (and excuse my English).

Juan

Hi Juan,

My experience has been that transitions from weekends to weekdays should work; we rely on it in the HTM-for-Stocks app, among others.
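One thing worth checking in the meantime: are you alerting on raw anomaly scores or on an anomaly likelihood? Numenta’s applications post-process raw scores into a likelihood that measures how unusual the current score is relative to the recent distribution of scores, which suppresses many one-off spikes. Below is a deliberately simplified, pure-Python illustration of that idea — not the actual NuPIC implementation, just a sketch of the concept:

```python
import math
from collections import deque


class SimpleAnomalyLikelihood:
    """Very simplified take on the anomaly-likelihood idea: model the
    recent distribution of raw anomaly scores and report how unlikely
    the newest score is under that distribution (1.0 = very unusual)."""

    def __init__(self, window=288):
        self.scores = deque(maxlen=window)

    def probability(self, score):
        if len(self.scores) < 10:
            # Not enough history yet to say anything meaningful.
            self.scores.append(score)
            return 0.5
        mean = sum(self.scores) / len(self.scores)
        var = sum((s - mean) ** 2 for s in self.scores) / len(self.scores)
        std = max(math.sqrt(var), 1e-6)
        z = (score - mean) / std
        # Gaussian upper-tail probability of seeing a score this high.
        tail = 0.5 * math.erfc(z / math.sqrt(2))
        self.scores.append(score)
        return 1.0 - tail
```

With a steady stream of low scores, an ordinary score maps to ~0.5 while a sudden spike maps close to 1.0 — so thresholding the likelihood (rather than the raw score) tolerates routine noise.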

Some questions come immediately to mind:

1. Are you aggregating your data and, if so, what is the aggregation period?
2. How many samples have you streamed into HTM Engine so far while still experiencing many false positives?
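To make question 1 concrete: aggregation just means rolling the raw per-minute samples up into fixed-width time buckets before modeling; HTM Engine does this internally based on aggregationInfo. The sketch below is plain Python purely to illustrate the idea — the helper name and sample data are made up:

```python
from datetime import datetime


def aggregate(samples, period_seconds=300):
    """Roll (timestamp, value) samples up into fixed-width buckets,
    averaging the values in each bucket (a sum would suit raw counts)."""
    buckets = {}
    for ts, value in samples:
        bucket_start = int(ts.timestamp()) // period_seconds * period_seconds
        buckets.setdefault(bucket_start, []).append(value)
    return sorted(
        (datetime.fromtimestamp(start), sum(vals) / len(vals))
        for start, vals in buckets.items()
    )


samples = [
    (datetime(2016, 1, 4, 9, 0), 120.0),
    (datetime(2016, 1, 4, 9, 1), 140.0),
    (datetime(2016, 1, 4, 9, 6), 80.0),
]
print(aggregate(samples))
```

Here the 09:00 and 09:01 samples land in one 5-minute bucket (averaging to 130.0) and the 09:06 sample in the next — fewer, smoother records per day for the model to learn from.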

On the other front, we recently merged a PR into HTM Engine that enables the full set of OPF (NuPIC’s Online Prediction Framework) parameters to be passed in when creating a model: monitorMetric takes a parameter (https://github.com/numenta/numenta-apps/blob/d9d7c7aec3149059a6ea436ec4e3166521d264f6/htmengine/htmengine/adapters/datasource/custom/__init__.py#L196-L198) that enables you to specify the full model configuration via the completeModelParams property inside modelSpec. The schema for completeModelParams is described here: https://github.com/numenta/numenta-apps/blob/d9d7c7aec3149059a6ea436ec4e3166521d264f6/htmengine/htmengine/adapters/datasource/model_spec_schema.json#L36. The tests offer a number of examples.
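For illustration, a minimal sketch of what such a call could look like — the metric name and field values here are placeholders, not recommendations, and the adapter call is commented out since it needs a running HTM Engine:

```python
# Sketch: a modelSpec that carries a full model configuration via
# completeModelParams. Placeholder values, not tuned recommendations.
modelSpec = {
    "datasource": "custom",
    "metricSpec": {"metric": "requests_per_minute"},
    "completeModelParams": {
        "timestampFieldName": "c0",
        "valueFieldName": "c1",
        "inferenceArgs": {
            "predictionSteps": [1],
            "predictedField": "c1",
            "inputPredictedField": "auto",
        },
        "modelConfig": {
            # The full OPF model configuration goes here; see the
            # model_spec_schema.json link above for the expected shape.
        },
    },
}

# Requires an HTM Engine installation:
# from htmengine.adapters.datasource import createDatasourceAdapter
# adapter = createDatasourceAdapter(modelSpec["datasource"])
# modelId = adapter.monitorMetric(modelSpec)
```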

Hope this helps.

Best,
Vitaly

P.S. I didn’t notice any issues with your English.

Juan,

Additionally, have you tried HTM Studio (http://numenta.com/htm-studio/)? You mentioned that you’re not a developer, and HTM Studio might make it easier to experiment with anomaly detection without any coding at all.

Hello

Thanks a lot, Vitaly. I’ll test what you suggest once I solve a new issue: my NuPIC installation suddenly failed with the error AssertionError('Expected numeric anomaly score with status=0, but got: 1.0').

Since then, I’ve been going crazy with the reinstallation.

Thanks again for your help.

P.S.: I’m waiting for a 64-bit laptop in order to play with HTM Studio.

Hi,

I’ve written a new version of the script create_cpu_percent_model.py with the full set of OPF parameters set to their default values, which is very helpful for me.

#!/usr/bin/env python
# ----------------------------------------------------------------------
# Numenta Platform for Intelligent Computing (NuPIC)
# Copyright (C) 2015, Numenta, Inc.  Unless you have purchased from
# Numenta, Inc. a separate commercial license for this software code, the
# following terms and conditions apply:
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License version 3 as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
# See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see http://www.gnu.org/licenses.
#
# http://numenta.org/licenses/
# ----------------------------------------------------------------------

"""Create the cpu percent model.  See also send_cpu.py and README.md."""

from htmengine.adapters.datasource import createDatasourceAdapter



modelSpec = {
  "datasource": "custom",
  "metricSpec": {
    "metric": "cpu_percent"
  },
  "modelParams": {
    "min": 0,
    "max": 100
  },
  "completeModelParams": {
    "modelConfig": {
      "aggregationInfo": {
        "seconds": 0,
        "fields": [],
        "months": 0,
        "days": 0,
        "years": 0,
        "hours": 0,
        "microseconds": 0,
        "weeks": 0,
        "minutes": 0,
        "milliseconds": 0
      },
      "model": "CLA",
      "version": 1,
      "predictAheadTime": "",
      "modelParams": {
        "sensorParams": {
          "verbosity": 0,
          "encoders": {
            "c0_dayOfWeek": "",
            "c0_timeOfDay": {
              "type": "DateEncoder",
              "timeOfDay": [21, 9.49122334747737],
              "fieldname": "c0",
              "name": "c0"
            },
            "c1": {
              "fieldname": "c1",
              "seed": 42,
              "resolution": 0.8771929824561403,
              "name": "c1",
              "type": "RandomDistributedScalarEncoder"
            },
            "c0_weekend": ""
          },
          "sensorAutoReset": ""
        },
        "anomalyParams": {
          "anomalyCacheRecords": "",
          "autoDetectThreshold": "",
          "autoDetectWaitRecords": 5030
        },
        "spParams": {
          "columnCount": 2048,
          "synPermInactiveDec": 0.0005,
          "maxBoost": 1,
          "spatialImp": "cpp",
          "inputWidth": 0,
          "spVerbosity": 0,
          "synPermConnected": 0.1,
          "synPermActiveInc": 0.0015,
          "seed": 1956,
          "numActiveColumnsPerInhArea": 40,
          "globalInhibition": 1,
          "potentialPct": 0.8
        },
        "trainSPNetOnlyIfRequested": "false",
        "clParams": {
          "alpha": 0.035828933612158,
          "regionName": "CLAClassifierRegion",
          "steps": "1",
          "clVerbosity": 0
        },
        "tpParams": {
          "columnCount": 2048,
          "activationThreshold": 13,
          "pamLength": 3,
          "cellsPerColumn": 32,
          "permanenceInc": 0.1,
          "minThreshold": 10,
          "verbosity": 0,
          "maxSynapsesPerSegment": 32,
          "outputType": "normal",
          "globalDecay": 0,
          "initialPerm": 0.21,
          "permanenceDec": 0.1,
          "seed": 1960,
          "maxAge": 0,
          "newSynapseCount": 20,
          "maxSegmentsPerCell": 128,
          "temporalImp": "cpp",
          "inputWidth": 2048
        },
        "clEnable": "false",
        "spEnable": "true",
        "inferenceType": "TemporalAnomaly",
        "tpEnable": "true"
      }
    },
    "inferenceArgs": {
      "predictionSteps": [1],
      "predictedField": "c1",
      "inputPredictedField": "auto"
    },
    "timestampFieldName": "c0",
    "valueFieldName": "c1"
  }
}


adapter = createDatasourceAdapter(modelSpec["datasource"])
modelId = adapter.monitorMetric(modelSpec)

print("Model %s created..." % modelId)

Feel free to use it or to update it in git.

Hi,

Here I am again, dealing with my initial question. :)

I created a model using a script with the complete model params schema, and I can see it in the metric table:

{
  "completeModelParams": {
    "valueFieldName": "c1", 
    "modelConfig": {
      "aggregationInfo": {
        "hours": 0, 
        "microseconds": 0, 
        "seconds": 0, 
        "fields": [], 
        "weeks": 0, 
        "months": 0, 
        "minutes": 0, 
        "days": 0, 
        "milliseconds": 0, 
        "years": 0
      }, 
      "model": "CLA", 
      "version": 1, 
      "predictAheadTime": "", 
      "modelParams": {
        "sensorParams": {
          "verbosity": 0, 
          "encoders": {
            "c0_dayOfWeek": {
              "dayOfWeek": [
                21, 
                5.01
              ], 
              "fieldname": "c0", 
              "type": "DateEncoder", 
              "name": "c0"
            }, 
            "c0_timeOfDay": {
              "fieldname": "c0", 
              "timeOfDay": [
                512, 
                9.49122334747737
              ], 
              "type": "DateEncoder", 
              "name": "c0"
            }, 
            "c1": {
              "type": "RandomDistributedScalarEncoder", 
              "seed": 42, 
              "fieldname": "c1", 
              "name": "c1", 
              "numBuckets": 130.0
            }, 
            "c0_weekend": ""
          }, 
          "sensorAutoReset": ""
        }, 
        "anomalyParams": {
          "anomalyCacheRecords": "", 
          "autoDetectThreshold": "", 
          "autoDetectWaitRecords": 5030
        }, 
        "clEnable": "false", 
        "spParams": {
          "columnCount": 102048, 
          "synPermInactiveDec": 0.0005, 
          "maxBoost": 1, 
          "spatialImp": "cpp", 
          "synPermConnected": 0.1, 
          "seed": 1956, 
          "numActiveColumnsPerInhArea": 40, 
          "globalInhibition": 1, 
          "inputWidth": 0, 
          "spVerbosity": 0, 
          "synPermActiveInc": 0.0015, 
          "potentialPct": 0.8
        }, 
        "trainSPNetOnlyIfRequested": "false", 
        "tpEnable": "true", 
        "clParams": {
          "alpha": 0.035828933612158, 
          "clVerbosity": 0, 
          "steps": "1", 
          "regionName": "CLAClassifierRegion"
        }, 
        "inferenceType": "TemporalAnomaly", 
        "spEnable": "true", 
        "tpParams": {
          "columnCount": 2048, 
          "activationThreshold": 13, 
          "pamLength": 3, 
          "cellsPerColumn": 32, 
          "permanenceInc": 0.1, 
          "minThreshold": 10, 
          "verbosity": 0, 
          "maxSynapsesPerSegment": 32, 
          "outputType": "normal", 
          "globalDecay": 0, 
          "initialPerm": 0.21, 
          "permanenceDec": 0.1, 
          "seed": 1960, 
          "maxAge": 0, 
          "newSynapseCount": 20, 
          "maxSegmentsPerCell": 128, 
          "temporalImp": "cpp", 
          "inputWidth": 2048
        }
      }
    }, 
    "timestampFieldName": "c0"
  }, 
  "datasource": "custom", 
  "metricSpec": {
    "metric": "modelo2"
  }
}

However, in the same table, the JSON values in the model_params field are different from the model I defined:

{
  "anomalyLikelihoodParams": null,
  "inferenceArgs": {
    "predictionSteps": [1],
    "predictedField": "c1",
    "inputPredictedField": "auto"
  },
  "modelConfig": {
    "aggregationInfo": {
      "seconds": 0,
      "fields": [],
      "months": 0,
      "days": 0,
      "years": 0,
      "hours": 0,
      "microseconds": 0,
      "weeks": 0,
      "minutes": 0,
      "milliseconds": 0
    },
    "model": "CLA",
    "version": 1,
    "predictAheadTime": null,
    "modelParams": {
      "sensorParams": {
        "verbosity": 0,
        "encoders": {
          "c0_dayOfWeek": null,
          "c0_timeOfDay": {
            "fieldname": "c0",
            "timeOfDay": [21, 9.49122334747737],
            "type": "DateEncoder",
            "name": "c0"
          },
          "c1": {
            "resolution": 160.3736842105263,
            "seed": 42,
            "fieldname": "c1",
            "name": "c1",
            "type": "RandomDistributedScalarEncoder"
          },
          "c0_weekend": null
        },
        "sensorAutoReset": null
      },
      "clEnable": false,
      "spParams": {
        "columnCount": 2048,
        "spVerbosity": 0,
        "maxBoost": 1.0,
        "spatialImp": "cpp",
        "inputWidth": 0,
        "synPermInactiveDec": 0.0005,
        "synPermConnected": 0.1,
        "synPermActiveInc": 0.0015,
        "seed": 1956,
        "numActiveColumnsPerInhArea": 40,
        "globalInhibition": 1,
        "potentialPct": 0.8
      },
      "trainSPNetOnlyIfRequested": false,
      "clParams": {
        "alpha": 0.035828933612158,
        "clVerbosity": 0,
        "steps": "1",
        "regionName": "CLAClassifierRegion"
      },
      "tpParams": {
        "columnCount": 2048,
        "activationThreshold": 13,
        "pamLength": 3,
        "cellsPerColumn": 32,
        "permanenceInc": 0.1,
        "minThreshold": 10,
        "verbosity": 0,
        "maxSynapsesPerSegment": 32,
        "outputType": "normal",
        "globalDecay": 0.0,
        "initialPerm": 0.21,
        "permanenceDec": 0.1,
        "seed": 1960,
        "maxAge": 0,
        "newSynapseCount": 20,
        "maxSegmentsPerCell": 128,
        "temporalImp": "cpp",
        "inputWidth": 2048
      },
      "anomalyParams": {
        "anomalyCacheRecords": null,
        "autoDetectThreshold": null,
        "autoDetectWaitRecords": 5030
      },
      "spEnable": true,
      "inferenceType": "TemporalAnomaly",
      "tpEnable": true
    }
  },
  "inputRecordSchema": [
    ["c0", "datetime", "T"],
    ["c1", "float", ""]
  ]
}
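To pinpoint exactly which fields changed between the submitted completeModelParams and the stored model_params, a small recursive dict diff helps. This is a generic sketch, not an HTM Engine utility, and the sample data below is a condensed, hand-made excerpt in the spirit of the two dumps above:

```python
def diff_dicts(a, b, path=""):
    """Recursively compare two nested dicts, returning (path, a_value,
    b_value) tuples for every leaf that differs or exists on one side only."""
    diffs = []
    for key in sorted(set(a) | set(b)):
        p = "%s.%s" % (path, key) if path else key
        if key not in a:
            diffs.append((p, "<missing>", b[key]))
        elif key not in b:
            diffs.append((p, a[key], "<missing>"))
        elif isinstance(a[key], dict) and isinstance(b[key], dict):
            diffs.extend(diff_dicts(a[key], b[key], p))
        elif a[key] != b[key]:
            diffs.append((p, a[key], b[key]))
    return diffs


# Condensed, illustrative excerpt of the submitted vs. stored params.
submitted = {"encoders": {"c0_dayOfWeek": {"dayOfWeek": [21, 5.01]},
                          "c1": {"numBuckets": 130.0}}}
stored = {"encoders": {"c0_dayOfWeek": None,
                       "c1": {"resolution": 160.37}}}

for path, was, now in diff_dicts(submitted, stored):
    print(path, was, now)
```

Running this against the full dumps would list every divergence in one pass, which makes it much easier to see what the engine rewrote or dropped.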

Does that make sense? Have I misunderstood something?

Regards