How do SP and TM parameters affect anomaly detection precision?

I am building an anomaly detector using the HTM community fork (htm.core) and Spark. As a first step, I am running this anomaly detector over NAB to see whether it can catch the anomalies in the data.
I am not able to decide what parameters to set for the encoders, spatial pooler, and temporal memory. After some digging, I found this post https://discourse.numenta.org/t/dont-swarm-for-anomaly-models/915 by @rhyolight. Essentially, it sets minValue, maxValue, and resolution for the encoder and uses this file to set the parameters for the spatial pooler and temporal memory. Does this mean that we can start anomaly detection with these parameters on any data set?
Even after copying the parameters from that post, the anomaly detection is not working as intended. How should I go about it?
PS: I intend to catch temporal as well as spatial anomalies.


We have found that it works very well on the vast majority of scalar data sets.

Are you using a raw anomaly score, or are you also calculating an anomaly likelihood? The likelihood gives you something you can apply a threshold to, which lets you classify what counts as an anomaly.
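
For instance, a minimal sketch of what I mean (the 0.9999 threshold is only illustrative; you would tune it to your data):

import math

LIKELIHOOD_THRESHOLD = 0.9999  # illustrative only; tune per dataset

def is_anomalous(likelihood):
    # Classify a point as anomalous once its likelihood crosses the threshold.
    return likelihood >= LIKELIHOOD_THRESHOLD

# Many people threshold a log-scaled score instead, which spreads out
# likelihood values close to 1.0 (same transform as in your code below):
def log_score(likelihood):
    return math.log(1.0000000001 - likelihood) / -23.02585084720009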

What are you expecting, vs what is really happening?

I am using the log anomaly likelihood.

As mentioned in this file, do we have to keep the radius for timeOfDay at 9.49? Shouldn't this represent the time difference (in hours) between two consecutive data points? For example, all the AWSCloudwatch readings in the NAB dataset are 5 minutes apart, so I was setting the radius for timeOfDay to 0.08. Also, do we always keep dayOfWeek and weekend null?
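
To illustrate the question, here is a quick sketch against the htm.core DateEncoder (the same API my detector uses); with the 9.49 radius, readings two hours apart still share active bits, while with 0.08 they do not:

import datetime
import numpy as np
from htm.encoders.date import DateEncoder

# Radius ~9.49 hours: nearby times of day share many active bits.
wideEncoder = DateEncoder(timeOfDay=(21, 9.49))
# Radius 0.08 hours (~5 minutes): almost any two readings look unrelated.
narrowEncoder = DateEncoder(timeOfDay=(21, 0.08))

t0 = datetime.datetime(2020, 1, 6, 12, 0)
t1 = t0 + datetime.timedelta(hours=2)

for name, enc in [("radius 9.49", wideEncoder), ("radius 0.08", narrowEncoder)]:
    a, b = enc.encode(t0), enc.encode(t1)
    overlap = len(np.intersect1d(a.sparse, b.sparse))
    print(name, "-> overlap for readings 2 hours apart:", overlap)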

Please see the following screenshot:


Graph1: Highlights the anomaly region as provided by NAB result data.
Graph2: Plots the anomaly scores as provided by NAB result data.
Graph3: Plots the anomaly scores calculated by my anomaly detector.

I am using following code to generate the results:

from pyspark import SparkContext

import math
import datetime
import numpy as np
import matplotlib.pyplot as plt

from htm.bindings.sdr import SDR, Metrics
from htm.encoders.rdse import RDSE, RDSE_Parameters
from htm.bindings.encoders import ScalarEncoder, ScalarEncoderParameters
from htm.encoders.date import DateEncoder
from htm.bindings.algorithms import SpatialPooler
from htm.bindings.algorithms import TemporalMemory
from htm.algorithms.anomaly_likelihood import \
    AnomalyLikelihood  # FIXME use TM.anomaly instead, but it gives worse results than the py.AnomalyLikelihood now
from htm.bindings.algorithms import Predictor

sc = SparkContext("local", "Anomaly Detection")

parameters = {
    'enc': {
        "rdse":
            {'resolution': 0.001, 'activeBits': 21, 'size': 700, 'sparsity': 0.02},
        "scalar":
            {'minimum': 0, 'maximum': 100, 'activeBits': 21, 'resolution': 0.001},
        "time":
            {'timeOfDay': (21, 9.49)}
    },

    'predictor': {'sdrc_alpha': 0.035828933612157998},

    'sp': {'boostStrength': 0.0,
           'columnCount': 2048,
           'localAreaDensity': 0.01953125,
           'seed': 1956,
           'potentialPct': 0.8,  # fraction of input bits each SP column can potentially connect to
           'synPermActiveInc': 0.003,
           'synPermConnected': 0.2,
           'synPermInactiveDec': 0.0005},

    'tm': {'activationThreshold': 13,
           'cellsPerColumn': 32,
           'initialPerm': 0.21,
           'seed': 1960,
           'maxSegmentsPerCell': 128,
           'maxSynapsesPerSegment': 32,
           'minThreshold': 10,
           'newSynapseCount': 20,
           'permanenceDec': 0.1,
           'permanenceInc': 0.1},

    'anomaly': {
        'likelihood':
            {  # 'learningPeriod': int(math.floor(self.probationaryPeriod / 2.0)),
                # 'probationaryPeriod': self.probationaryPeriod - default_parameters["anomaly"]["likelihood"]["learningPeriod"],
                'learningPeriod': 500,
                'reestimationPeriod': 100
            }
    }
}

useRdseEncoder = True
useScalarEncoder = False

if useRdseEncoder and useScalarEncoder:
    raise Exception("Enable only one of the RDSE and scalar encoders")
elif (not useScalarEncoder) and (not useRdseEncoder):
    raise Exception("Enable at least one encoder")

dateEncoder = DateEncoder(timeOfDay=parameters["enc"]["time"]["timeOfDay"])

rdseEncoderParams = RDSE_Parameters()
rdseEncoderParams.size = parameters["enc"]["rdse"]["size"]
# RDSE takes either sparsity or activeBits, not both; activeBits is set below.
# rdseEncoderParams.sparsity = parameters["enc"]["rdse"]["sparsity"]
rdseEncoderParams.resolution = parameters["enc"]["rdse"]["resolution"]
rdseEncoderParams.activeBits = parameters["enc"]["rdse"]["activeBits"]
rdseEncoder = RDSE(rdseEncoderParams)

scalarEncoderParams = ScalarEncoderParameters()
scalarEncoderParams.minimum = parameters["enc"]["scalar"]["minimum"]
scalarEncoderParams.maximum = parameters["enc"]["scalar"]["maximum"]
scalarEncoderParams.activeBits = parameters["enc"]["scalar"]["activeBits"]
scalarEncoderParams.resolution = parameters["enc"]["scalar"]["resolution"]
scalarEncoder = ScalarEncoder(scalarEncoderParams)

valueEncoder = scalarEncoder if useScalarEncoder else rdseEncoder

encodingWidth = (dateEncoder.size + valueEncoder.size)
enc_info = Metrics([encodingWidth], 999999999)

# Make the HTM.  SpatialPooler & TemporalMemory & associated tools.
spParams = parameters["sp"]
sp = SpatialPooler(
    inputDimensions=(encodingWidth,),
    columnDimensions=(spParams["columnCount"],),
    potentialPct=spParams["potentialPct"],
    potentialRadius=encodingWidth,
    globalInhibition=True,
    localAreaDensity=spParams["localAreaDensity"],
    seed=spParams["seed"],
    synPermInactiveDec=spParams["synPermInactiveDec"],
    synPermActiveInc=spParams["synPermActiveInc"],
    synPermConnected=spParams["synPermConnected"],
    boostStrength=spParams["boostStrength"],
    wrapAround=True
)
sp_info = Metrics(sp.getColumnDimensions(), 999999999)

tmParams = parameters["tm"]
tm = TemporalMemory(
    columnDimensions=(spParams["columnCount"],),
    cellsPerColumn=tmParams["cellsPerColumn"],
    seed=tmParams["seed"],
    activationThreshold=tmParams["activationThreshold"],
    initialPermanence=tmParams["initialPerm"],
    connectedPermanence=spParams["synPermConnected"],
    minThreshold=tmParams["minThreshold"],
    maxNewSynapseCount=tmParams["newSynapseCount"],
    permanenceIncrement=tmParams["permanenceInc"],
    permanenceDecrement=tmParams["permanenceDec"],
    predictedSegmentDecrement=0.0,
    maxSegmentsPerCell=tmParams["maxSegmentsPerCell"],
    maxSynapsesPerSegment=tmParams["maxSynapsesPerSegment"]
)
tm_info = Metrics([tm.numberOfCells()], 999999999)

# setup likelihood
anParams = parameters["anomaly"]["likelihood"]
# probationaryPeriod = anParams["probationaryPeriod"]
learningPeriod = anParams["learningPeriod"] #int(math.floor(probationaryPeriod / 2.0))
anomaly_history = AnomalyLikelihood(learningPeriod=learningPeriod,reestimationPeriod=anParams["reestimationPeriod"])

print("======================================================")
print("MODELS CREATED")
print("======================================================")

# Iterate through every datum in the dataset, record the inputs & outputs.
inputs = []
anomaly = []
anomalyProb = []
anomalyProb_fromResult = []
label_fromResult = []

SPATIAL_TOLERANCE = 0.05

minValueSeenSoFar = None
maxValueSeenSoFar = None

# Columns in a NAB results file: timestamp, value, anomaly_score, raw_score, label.
records = (
    sc.textFile('NAB/results/numenta/realAWSCloudwatch/numenta_rds_cpu_utilization_e47b3b.csv')
      .map(lambda line: tuple(line.split(',')[:5]))
      .collect()
)

print("======================================================")
print("DATA LOADED")
print("======================================================")

count = 0
for record in records:
    if count > 0:  # skip the header row
        # Parse the timestamp string into a Python datetime object.
        timestamp = datetime.datetime.strptime(record[0], "%Y-%m-%d %H:%M:%S")
        cpu = float(record[1])
        inputs.append(cpu)

        anomalyProb_fromResult.append(float(record[2]))
        label_fromResult.append(float(record[4]))

        # Call the encoders to create bit representations for each value.  These are SDR objects.
        dateBits = dateEncoder.encode(timestamp)
        cpuBits = valueEncoder.encode(cpu) #rdseEncoder.encode(cpu)

        # Concatenate all these encodings into one large encoding for Spatial Pooling.
        encoding = SDR(encodingWidth).concatenate([cpuBits, dateBits])

        enc_info.addData(encoding)

        # Create an SDR to represent active columns, This will be populated by the
        # compute method below. It must have the same dimensions as the Spatial Pooler.
        activeColumns = SDR(sp.getColumnDimensions())

        # Execute Spatial Pooling algorithm over input space.
        sp.compute(encoding, True, activeColumns)
        sp_info.addData(activeColumns)

        # Execute Temporal Memory algorithm over active mini-columns.
        tm.compute(activeColumns, learn=True)
        tm_info.addData(tm.getActiveCells().flatten())

        # Update min/max values and check if there is a spatial anomaly
        spatialAnomaly = False
        if minValueSeenSoFar != maxValueSeenSoFar:
            tolerance = ( maxValueSeenSoFar - minValueSeenSoFar ) * SPATIAL_TOLERANCE
            maxValueExpected = maxValueSeenSoFar + tolerance
            minValueExpected = minValueSeenSoFar - tolerance
            if cpu > maxValueExpected or cpu < minValueExpected:
                spatialAnomaly = True
        if maxValueSeenSoFar is None or cpu > maxValueSeenSoFar:
            maxValueSeenSoFar = cpu
        if minValueSeenSoFar is None or cpu < minValueSeenSoFar:
            minValueSeenSoFar = cpu

        anomalyLikelihood = anomaly_history.anomalyProbability(cpu, tm.anomaly)

        # Same log transform NuPIC uses: the constant is ln(1e-10), so a
        # likelihood of 1.0 maps to a score of 1.0 and values near 1.0 get
        # spread out (a 0.3 threshold here corresponds to likelihood ~0.999).
        logAnomalyScore = math.log(1.0000000001 - anomalyLikelihood) / -23.02585084720009
        if spatialAnomaly:
            anomalyProb.append(1.0)
        else:
            anomalyProb.append(logAnomalyScore)
    count = count + 1

print("======================================================")
print("MODEL HAS BEEN FED WITH THE DATA")
print("======================================================")

plt.subplot(3, 1, 1)
plt.title("Label - Result")
plt.xlabel("Time")
plt.ylabel("Value")
inputs = np.array(inputs) / max(inputs)  # normalize the input curve once for all three plots
plt.plot(np.arange(len(inputs)), inputs, 'red',
         np.arange(len(inputs)), label_fromResult, 'blue',
         )
plt.legend(labels=('Input', 'Label - Result'))

plt.subplot(3, 1, 2)
plt.title("Anomaly Scores - Result")
plt.xlabel("Time")
plt.ylabel("Value")
plt.plot(np.arange(len(inputs)), inputs, 'red',
         np.arange(len(inputs)), anomalyProb_fromResult, 'blue',
         )
plt.legend(labels=('Input', 'Anomaly Scores - Result'))

plt.subplot(3, 1, 3)
plt.title("Log Anomaly Scores - Detection")
plt.xlabel("Time")
plt.ylabel("Value")
plt.plot(np.arange(len(inputs)), inputs, 'red',
         np.arange(len(inputs)), anomalyProb, 'blue',
         np.arange(len(inputs)), np.array([0.3]*len(inputs)), 'green'
         )
plt.legend(labels=('Input', 'Log Anomaly Scores - Detection', 'Threshold'))

plt.show()

Feel free to change these things. They are generic settings. You’ll do a better job tuning them to your data if you know the dataset.

Regarding your results, I agree you should see an anomaly where you are not seeing one. However, you are comparing two different HTM implementations. When the community forked NuPIC, we at Numenta were clear that we could not vet whether the algorithms in the fork compare directly with the algorithms in NuPIC (which are what run in NAB). Any change or difference in a model parameter, or a default setting in htm.core, could be altering the results. There are a lot of places to troubleshoot, and I'm not sure where to start helping you, especially since I have not worked on htm.core.

At this point, I have decided to move back to NuPIC since it has better support from you guys.

Is my understanding right that the radius should be the time difference in hours between two data points?

I read in one of your posts that one TM detector cannot learn short-term and long-term patterns. So what should the ideal frequency of the data be in order to learn the long-term patterns? My data is expected to have daily patterns, with weekdays being different.

The maxVal and minVal in

getScalarMetricWithTimeOfDayAnomalyParams

are used to calculate the resolution for the RDSE. Should minVal and maxVal be the extreme ends of the values the data can take, or the min and max of the dataset? For example, say we are running the detector on CPU utilization data: the min and max values this data can take are 0 and 100 respectively, but the dataset might only contain values between 30 and 79. In this case, what should minVal and maxVal be?


I treat this as a hyper-parameter of the whole system: sample the first few rows to get a rough sense of what min/max granularity makes sense, e.g. with percentiles. Here's the source for the getScalarMetrics… RDSE param generator:

It sets the min/max range for the resolution by +/- 1 sigma on the max/min found in ‘metricData’, with a .001 floor.
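
Putting that together, the resolution computation reduces to roughly the following sketch (numBuckets=130 is what I believe the NuPIC generator uses; the percentile sampling of the first rows is my own heuristic, and the function name is mine):

import numpy as np

def estimate_rdse_resolution(first_values, numBuckets=130.0, minResolution=0.001):
    # Estimate a min/max range from an initial sample using percentiles, so a
    # few outliers don't stretch the range, then derive the RDSE resolution
    # the way the NuPIC param generator does: range / numBuckets, with a floor.
    values = np.asarray(first_values, dtype=float)
    minVal, maxVal = np.percentile(values, [1, 99])
    return max(minResolution, (maxVal - minVal) / numBuckets)

# e.g. CPU readings that mostly sit between 30 and 79:
print(estimate_rdse_resolution(np.random.uniform(30, 79, 500)))  # ~0.37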


You should read this: NuPIC & Python 2 end of life

@rhyolight thanks for the heads-up. I have been using Docker for my experiments, so I guess I am good for now, until I think of a new strategy.

Is there any documentation on htm.core that I can refer to for better understanding?

Any comments on this?

What post? I don't think this is true. One TM can learn all kinds of patterns and sub-patterns; it just doesn't know when any of them actually ends unless you manually tell it with a reset().
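
As a sketch of what reset() is for (reusing the sp/tm objects from your script; the sequence structure and function name here are hypothetical):

from htm.bindings.sdr import SDR

def feed_sequences(sp, tm, sequences):
    # `sequences` is a hypothetical list of lists of already-encoded SDRs.
    for sequence in sequences:
        for encoding in sequence:
            activeColumns = SDR(sp.getColumnDimensions())
            sp.compute(encoding, True, activeColumns)
            tm.compute(activeColumns, learn=True)
        # Mark the boundary so the TM does not learn a spurious transition
        # from the end of one sequence into the start of the next.
        tm.reset()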

I meant that it cannot learn both at the same time. Can it?

Yes, it can learn long and short patterns at the same time.


Hi, thanks for the clarification!

What does the numBuckets parameter represent there? I can't find anything in the docs…

Any hint is highly appreciated!

The encoders quantize the input into buckets. numBuckets is the number of distinct encodings the encoder can produce.
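
Roughly speaking (an illustrative sketch, not the encoder's actual code):

def bucket_index(value, minVal, resolution):
    # Every value inside the same resolution-wide bucket gets the same encoding.
    return int((value - minVal) / resolution)

# With minVal=0 and resolution=0.5, 10.1 and 10.3 land in the same bucket:
print(bucket_index(10.1, 0, 0.5), bucket_index(10.3, 0, 0.5))  # 20 20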

I assume you are using htm.core. Where did you find that parameter? I don’t see it in the TM.