Anomaly detection with HTM.core model on sine

Robbert_3 · September 22, 2021, 2:01pm

The last few weeks I have been busy with the HTM.core package to make a nice HTM model.
I didn’t quite get the results I wanted on my dataset, so I am now trying to build a good model on an almost perfect sine function (with a flat line as the anomaly). However, the model does not detect the anomaly: the anomaly score does not go up and the likelihood shows no change either. I’ve tried a few things in the meantime, but can’t get it to work properly.
Does anyone have an idea what I should change about the approach to get it working properly?

This is the code I used, most of it is from the hotgym example.

import math
import numpy as np
import pandas as pd

from sklearn.metrics import mean_squared_error

import plotly.graph_objects as go
from plotly.subplots import make_subplots

from htm.bindings.sdr import SDR, Metrics
from htm.encoders.scalar_encoder import ScalarEncoder, ScalarEncoderParameters
from htm.bindings.algorithms import SpatialPooler
from htm.bindings.algorithms import TemporalMemory
from htm.algorithms.anomaly_likelihood import \
    AnomalyLikelihood  # FIXME use TM.anomaly instead, but it gives worse results than the py.AnomalyLikelihood now
from htm.bindings.algorithms import Predictor

x = np.linspace(0, np.pi*80, 8000)
y = np.sin(x) + 1

df = pd.DataFrame(data=zip(x, y), index=range(len(x)), columns=['x','y'])

# making the anomaly
df.iloc[6000:6100,:] = 0

print(df.shape)

parameters = {
    'predictor': {'sdrc_alpha': 0.05},
    'sp': { 
           'columnDimensions': (2048, )
          },
    'tm': {
        'columnDimensions':(2048, )
          },
    'anomaly': {
        'likelihood':
            { 
            'learningPeriod':288, # if None it will be calculated later, else value (indication is 500 for 5-min interval)
            'historicWindowSize':4000,  # default of 8640 is a month's worth of history at 5-minute intervals
            'probationaryPct': 0.1,
            'reestimationPeriod': 200,
            'estimationSamples':100}  # how often we re-estimate the Gaussian distribution
    }
}

#when changing the encoders
par = ScalarEncoderParameters()
par.activeBits = 41
par.minimum = -1
par.maximum = 3
par.size = 2048
scalarEncoder = ScalarEncoder(par)

encodingWidth = (scalarEncoder.size)
enc_info = Metrics([encodingWidth], 999999999)

spParams = parameters["sp"]

sp = SpatialPooler(
        inputDimensions=(encodingWidth,),
        columnDimensions=spParams["columnDimensions"])

sp_info = Metrics(sp.getColumnDimensions(), 999999999)

tmParams = parameters["tm"]

tm = TemporalMemory(
        columnDimensions=tmParams["columnDimensions"])

tm_info = Metrics([tm.numberOfCells()], 999999999)

step = 5
# setup likelihood, these settings are used in NAB
anParams = parameters["anomaly"]["likelihood"]
if anParams.get("learningPeriod") is None:
    probationaryPeriod = int(math.floor(float(anParams["probationaryPct"]) * df.shape[0]))
    learningPeriod = int(math.floor(probationaryPeriod / 2.0))
else:
    learningPeriod = anParams["learningPeriod"]
anomaly_history = AnomalyLikelihood(learningPeriod=learningPeriod,
                                    estimationSamples=anParams["estimationSamples"],
                                    reestimationPeriod=anParams["reestimationPeriod"],
                                    historicWindowSize=anParams["historicWindowSize"])

predictor = Predictor(steps=[1, step], alpha=parameters["predictor"]['sdrc_alpha'])
predictor_resolution = 0.1

inputs = []
anomaly = []
anomalyLikelihood = []
log_anomalyLikelihood = []
predictions = {1: [], step: []}
# Iterate over df.y (not the untouched array y) so that the injected anomaly reaches the model.
for count, record in enumerate(df.y):

    # Call the encoders to create bit representations for each value.  These are SDR objects.
    consumptionBits = scalarEncoder.encode(record)

    # Concatenate all these encodings into one large encoding for Spatial Pooling.
    encoding = consumptionBits
    enc_info.addData(encoding)

    # Create an SDR to represent active columns, This will be populated by the
    # compute method below. It must have the same dimensions as the Spatial Pooler.
    activeColumns = SDR(sp.getColumnDimensions())

    # Execute Spatial Pooling algorithm over input space.
    overlaps = sp.compute(encoding, True, activeColumns)
        
    sp_info.addData(activeColumns)
    
    # Execute Temporal Memory algorithm over active mini-columns.
    tm.compute(activeColumns, learn=True)
    tm_info.addData(tm.getActiveCells().flatten())

    # Predict what will happen, and then train the predictor based on what just happened.
    pdf = predictor.infer(tm.getActiveCells())
    for n in (1, step):
        if pdf[n]:
            predictions[n].append(np.argmax(pdf[n]) * predictor_resolution)
        else:
            predictions[n].append(float('nan'))

    anomaly_Likelihood = anomaly_history.anomalyProbability(record, tm.anomaly)
    anomaly.append(tm.anomaly)
    logAnomalyLikelihood = np.log(1.0000000001 - anomaly_Likelihood) / -23.02585084720009
    anomalyLikelihood.append(anomaly_Likelihood)
    log_anomalyLikelihood.append(logAnomalyLikelihood)
    
    predictor.learn(count, tm.getActiveCells(), int(record / predictor_resolution))

# Print information & statistics about the state of the HTM.
print("Encoded Input", enc_info)
print("")
print("Spatial Pooler Mini-Columns", sp_info)
print(str(sp))
print("")
print("Temporal Memory Cells", tm_info)
print(str(tm))
print("")

THRESHOLD = 0.2
df['anomaly'] = log_anomalyLikelihood

# Shift the predictions so that they are aligned with the input they predict.
for n_steps, pred_list in predictions.items():
    for _ in range(n_steps):  # avoid shadowing the array x defined above
        pred_list.insert(0, float('nan'))
        pred_list.pop()

# Calculate the predictive accuracy, Root-Mean-Squared
accuracy = {1: 0, step: 0}
accuracy_samples = {1: 0, step: 0}

for idx, inp in enumerate(df.y):
    for n in predictions:  # For each [N]umber of time steps ahead which was predicted.
        val = predictions[n][idx]
        if not np.isnan(val):
            accuracy[n] += (inp - val) ** 2
            accuracy_samples[n] += 1
for n in sorted(predictions):
    accuracy[n] = (accuracy[n] / accuracy_samples[n]) ** .5
    print("Predictive Error (RMS)", n, "steps ahead:", accuracy[n])
print("Random guess, mean temperature:")
print(mean_squared_error(df.y, [np.mean(df.y)]*df.y.shape[0])**0.5)

df_t = df[df.anomaly >THRESHOLD]

fig = make_subplots(specs=[[{"secondary_y": True}]])
    
fig.add_trace(go.Scatter(x=df.index, y=df.y, name='Sinus'), secondary_y=False)

fig.add_trace(go.Scatter(x=df_t.index, y=df_t.y, mode='markers', name='Anomaly'), secondary_y=False)

fig.add_trace(go.Scatter(x=df.index, y=predictions[1], name='prediction one step ahead'), secondary_y=False)

fig.add_trace(go.Scatter(x=df.index, y=predictions[step], name=f'prediction {step} step ahead'), secondary_y=False)

fig.add_trace(go.Scatter(x=df.index, y=anomaly, name='Anomaly score TM'), secondary_y=True)

fig.add_trace(go.Scatter(x=df.index, y=anomalyLikelihood, name='Anomaly Likelihood'), secondary_y=True)

fig.add_trace(go.Scatter(x=df.index, y=log_anomalyLikelihood, name='Log Likelihood', line_color='#ffe476'), secondary_y=True)

fig.add_trace(go.Scatter(x=df.index, y=np.array([THRESHOLD]*df.shape[0]), name='Threshold'), secondary_y=True)


fig.update_layout(autosize=False, width=1000, height=500)
fig.update_yaxes(
    title_text = "Sinus",
    title_standoff = 25,
    secondary_y=False)

fig.update_yaxes(
    title_text = "Anomaly score",
    title_standoff = 25,
    secondary_y=True)

fig.show()
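
A side note on the scaling used in the loop above: the constant 23.02585084720009 is approximately the magnitude of ln(1e-10), so the expression maps a likelihood between 0 and 1 onto a log scale that also runs from roughly 0 to 1. The sketch below shows which likelihood a given threshold on that scale corresponds to. The helper name `log_likelihood` is made up for illustration and is not part of htm.core.

```python
import math

def log_likelihood(likelihood):
    """Same scaling as in the posted loop: log(1.0000000001 - L) / -23.02585084720009."""
    return math.log(1.0000000001 - likelihood) / -23.02585084720009

# Each extra "nine" in the likelihood adds about 0.1 on the log scale.
for likelihood in (0.9, 0.99, 0.999, 0.9999):
    print(likelihood, round(log_likelihood(likelihood), 3))
```

Read this way, a THRESHOLD of 0.2 on the log scale only fires once the likelihood reaches about 0.99.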

Robbert_3 · September 22, 2021, 2:03pm

When zooming in, the metric will be like this:

dmac · September 23, 2021, 1:41pm

Hi,

I looked at your code and I see a few issues:

1. You’re missing some arguments to the SpatialPooler.

Try adding the following two arguments to the SpatialPooler’s init:

       potentialRadius = 2048,
       globalInhibition = True,

Here is how I debugged this. The following is part of the debugging printout that your program prints:

Spatial Pooler Connections:
    [...]
    Potential Synapses on Segment Min/Mean/Max 17 / 17 / 17
    Connected Synapses on Segment Min/Mean/Max 0 / 2.90918 / 17

The line starting with “Potential Synapses” means that each mini-column in your SpatialPooler can have at most 17 synapses.

The line starting with “Connected Synapses” means that at the end of the program run, each mini-column in the spatial pooler had (on average) 2.9 connected synapses.

The argument `potentialRadius` tells the SpatialPooler to form synapses with inputs that are far away (as far away as this radius). By default this number is 16, which is much too small for your program.

2. It’s detecting anomalies before the first real anomaly. I would recommend removing the anomaly (or just ignoring it) and focusing on fixing the TM to work with non-anomalous data. If the TM does not recognize the non-anomalous data, then it cannot detect the anomalies either.

3. You removed all of the parameters! There should have been a whole bunch of numbers in the parameters dictionary at the top of the file. If you want to play with these algorithms, then you will need to play with the parameters too.
Also, some of the parameters have bad (not useful) default values, so you will need to specify them every time.
The documentation for the SpatialPooler and TemporalMemory algorithms has a full listing of the parameters.

4. You might have issues with the AnomalyLikelihood code not working correctly.
I suspect there are many bugs/issues in the HTM.Core implementation of it.
If your raw anomaly score is good but the likelihood is bad, then it might not be your fault.
And if you report issues (either here or on the htm.core GitHub page) then we can work together to improve it!

I hope this helps.
David
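
On the 17 potential synapses in the printout in this post: a rough sketch of where that number may come from. It assumes a one-dimensional input, a visible window of 2 * radius + 1 inputs per mini-column, and a sampling fraction (`potentialPct`) of 0.5. The fraction is not stated anywhere in this thread, and the function name is made up for illustration, so treat this as an approximation rather than the library's exact logic.

```python
def potential_pool_size(potential_radius, input_size, potential_pct=0.5):
    """Approximate number of potential synapses per mini-column (1-D input).

    Assumption: a column can see a window of 2 * radius + 1 inputs (clipped to
    the input size) and samples the fraction `potential_pct` of that window.
    """
    window = min(2 * potential_radius + 1, input_size)
    return int(window * potential_pct + 0.5)

print(potential_pool_size(16, 2048))    # 17, matching the printout
print(potential_pool_size(2048, 2048))  # 1024
```

With only 17 reachable inputs out of 2048, and 41 active bits per encoding, most mini-columns rarely see any active input at all, which fits the low number of connected synapses.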

Robbert_3 · September 24, 2021, 9:22am

Hey David,

Thank you so much for the help! It really helped me a lot.

I changed the code and the data a bit to test whether it works in more situations:

added a date field, encoded with a DateEncoder
added more anomaly windows with different anomalies
classified anomalies based on the raw score instead of the likelihood (suggested in point 4)
added some parameters for the Spatial Pooler and Temporal Memory (suggested in points 1 and 3)

import math
import numpy as np
import pandas as pd

from sklearn.metrics import mean_squared_error
from sklearn.metrics import log_loss

import plotly.graph_objects as go
from plotly.subplots import make_subplots

from htm.bindings.sdr import SDR, Metrics
from htm.encoders.scalar_encoder import ScalarEncoder, ScalarEncoderParameters
from htm.bindings.algorithms import SpatialPooler
from htm.bindings.algorithms import TemporalMemory
from htm.encoders.date import DateEncoder
from htm.algorithms.anomaly_likelihood import \
    AnomalyLikelihood  # FIXME use TM.anomaly instead, but it gives worse results than the py.AnomalyLikelihood now
from htm.bindings.algorithms import Predictor

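x = np.linspace(0, np.pi*60, 8640)
y = np.sin(x) + 1
# `date` was not defined in the posted code. Assumption: one timestamp per sample at
# 5-minute intervals (8640 samples = 30 days); the start date is arbitrary.
date = pd.date_range(start='2021-01-01', periods=len(x), freq='5min')

df = pd.DataFrame(data=zip(x, y, date), index=range(len(x)), columns=['x','y', 'date'])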

# making label
df['label'] = 0

df.iloc[3700:3800,3] = 1
df.iloc[4100:4150,3] = 1
df.iloc[6000:6100,3] = 1
df.iloc[6800:7000,3] = 1

# making the anomaly
df.iloc[3700:3800,1] = 0.7
df.iloc[4100:4150,1] = 0.5
df.iloc[6000:6100,1] = 0.0
df.iloc[6800:7000,1] = np.random.random_sample([200,1])

print(df.shape)

parameters = {
    'predictor': {'sdrc_alpha': 0.05},
    'sp': { 
            'columnDimensions': (2048, ),
            'potentialRadius' : 2048,
            'globalInhibition' : True
          },
    'tm': {
        'columnDimensions': (2048, ), 
        'activationThreshold': 17,
        'cellsPerColumn': 13,
        'initialPerm': 0.21,
        'maxSegmentsPerCell': 128,
        'maxSynapsesPerSegment': 64,
        'minThreshold': 10,
        'connectedPermanence': 0.13999999999999999,
        'newSynapseCount': 32,
        'permanenceDec': 0.1,
        'permanenceInc': 0.1},
    'anomaly': {
        'likelihood':
            { 
            'learningPeriod':400, # if None it will be calculated later, else value (indication is 500 for 5-min interval)
            'historicWindowSize':4000,  # default of 8640 is a month's worth of history at 5-minute intervals
            'probationaryPct': 0.1,
            'reestimationPeriod': 100,
            'estimationSamples':100}  # how often we re-estimate the Gaussian distribution
    }
}

#when changing the encoders
par = ScalarEncoderParameters()
par.activeBits = 24
par.minimum = -1
par.maximum = 3
par.size = 2048*2
scalarEncoder = ScalarEncoder(par)

dateEncoder = DateEncoder(timeOfDay=(7,4), weekend=3, dayOfWeek=7)
encodingWidth = (dateEncoder.size + scalarEncoder.size)
enc_info = Metrics([encodingWidth], 999999999)

spParams = parameters["sp"]

sp = SpatialPooler(
        inputDimensions=(encodingWidth,),
        potentialRadius = spParams["potentialRadius"],
        globalInhibition = spParams["globalInhibition"],
        columnDimensions=spParams["columnDimensions"])

sp_info = Metrics(sp.getColumnDimensions(), 999999999)

tmParams = parameters["tm"]

# tm = TemporalMemory(
#         columnDimensions=tmParams["columnDimensions"])
tm = TemporalMemory(
        columnDimensions=tmParams["columnDimensions"],
        cellsPerColumn=tmParams["cellsPerColumn"],
        activationThreshold=tmParams["activationThreshold"],
        initialPermanence=tmParams["initialPerm"],
        connectedPermanence=tmParams["connectedPermanence"],
        minThreshold=tmParams["minThreshold"],
        maxNewSynapseCount=tmParams["newSynapseCount"],
        permanenceIncrement=tmParams["permanenceInc"],
        permanenceDecrement=tmParams["permanenceDec"],
        predictedSegmentDecrement=0.0,
        maxSegmentsPerCell=tmParams["maxSegmentsPerCell"],
        maxSynapsesPerSegment=tmParams["maxSynapsesPerSegment"]
        )

tm_info = Metrics([tm.numberOfCells()], 999999999)

step = 5
# setup likelihood, these settings are used in NAB
anParams = parameters["anomaly"]["likelihood"]
if anParams.get("learningPeriod") is None:
    probationaryPeriod = int(math.floor(float(anParams["probationaryPct"]) * df.shape[0]))
    learningPeriod = int(math.floor(probationaryPeriod / 2.0))
else:
    learningPeriod = anParams["learningPeriod"]
anomaly_history = AnomalyLikelihood(learningPeriod=learningPeriod,
                                    estimationSamples=anParams["estimationSamples"],
                                    reestimationPeriod=anParams["reestimationPeriod"],
                                    historicWindowSize=anParams["historicWindowSize"])

predictor = Predictor(steps=[1, step], alpha=parameters["predictor"]['sdrc_alpha'])
predictor_resolution = 0.1

inputs = []
anomaly = []
anomalyLikelihood = []
log_anomalyLikelihood = []
predictions = {1: [], step: []}
for count, record in df.iterrows():

    dateBits = dateEncoder.encode(record.date)
    consumptionBits = scalarEncoder.encode(record.y)

    # Concatenate all these encodings into one large encoding for Spatial Pooling.
    encoding = SDR(encodingWidth).concatenate([consumptionBits, dateBits])
    enc_info.addData(encoding)

    # Create an SDR to represent active columns, This will be populated by the
    # compute method below. It must have the same dimensions as the Spatial Pooler.
    activeColumns = SDR(sp.getColumnDimensions())

    # Execute Spatial Pooling algorithm over input space.
    overlaps = sp.compute(encoding, True, activeColumns)
        
    sp_info.addData(activeColumns)
    
    # Execute Temporal Memory algorithm over active mini-columns.
    tm.compute(activeColumns, learn=True)
    tm_info.addData(tm.getActiveCells().flatten())

    # Predict what will happen, and then train the predictor based on what just happened.
    pdf = predictor.infer(tm.getActiveCells())
    for n in (1, step):
        if pdf[n]:
            predictions[n].append(np.argmax(pdf[n]) * predictor_resolution)
        else:
            predictions[n].append(float('nan'))

    anomaly_Likelihood = anomaly_history.anomalyProbability(record.y, tm.anomaly)
    anomaly.append(tm.anomaly)
    logAnomalyLikelihood = np.log(1.0000000001 - anomaly_Likelihood) / -23.02585084720009
    anomalyLikelihood.append(anomaly_Likelihood)
    log_anomalyLikelihood.append(logAnomalyLikelihood)
    
    predictor.learn(count, tm.getActiveCells(), int(record.y / predictor_resolution))

# Print information & statistics about the state of the HTM.
print("Encoded Input", enc_info)
print("")
print("Spatial Pooler Mini-Columns", sp_info)
print(str(sp))
print("")
print("Temporal Memory Cells", tm_info)
print(str(tm))
print("")

THRESHOLD_LIKELIHOOD = 0.3
THRESHOLD_RAW_SCORE = 0.9
df['loglikelihood_anomaly'] = log_anomalyLikelihood
df['raw_anomaly_score'] = anomaly

# Shift the predictions so that they are aligned with the input they predict.
for n_steps, pred_list in predictions.items():
    for i in range(n_steps):
        pred_list.insert(0, float('nan'))
        pred_list.pop()

# Calculate the predictive accuracy, Root-Mean-Squared
accuracy = {1: 0, step: 0}
accuracy_samples = {1: 0, step: 0}

for idx, inp in enumerate(df.y):
    for n in predictions:  # For each [N]umber of time steps ahead which was predicted.
        val = predictions[n][idx]
        if not np.isnan(val):
            accuracy[n] += (inp - val) ** 2
            accuracy_samples[n] += 1
for n in sorted(predictions):
    accuracy[n] = (accuracy[n] / accuracy_samples[n]) ** .5
    print("Predictive Error (RMS)", n, "steps ahead:", accuracy[n])
print("Random guess, mean temperature:")
print(mean_squared_error(df.y, [np.mean(df.y)]*df.y.shape[0])**0.5)

df_t = df[(df.raw_anomaly_score >=THRESHOLD_RAW_SCORE)] # (df.loglikelihood_anomaly >=THRESHOLD_LIKELIHOOD) | 

fig = make_subplots(specs=[[{"secondary_y": True}]])
    
fig.add_trace(go.Scatter(x=df.index, y=df.y, name='Sinus'), secondary_y=False)

fig.add_trace(go.Scatter(x=df_t.index, y=df_t.y, mode='markers', name='Anomaly'), secondary_y=False)

fig.add_trace(go.Scatter(x=df.index, y=predictions[1], name='prediction one step ahead'), secondary_y=False)

fig.add_trace(go.Scatter(x=df.index, y=predictions[step], name=f'prediction {step} step ahead'), secondary_y=False)

fig.add_trace(go.Scatter(x=df.index, y=anomaly, name='Anomaly score TM'), secondary_y=True)

fig.add_trace(go.Scatter(x=df.index, y=anomalyLikelihood, name='Anomaly Likelihood'), secondary_y=True)

fig.add_trace(go.Scatter(x=df.index, y=log_anomalyLikelihood, name='Log Likelihood', line_color='#ffe476'), secondary_y=True)

fig.add_trace(go.Scatter(x=df.index, y=np.array([THRESHOLD_LIKELIHOOD]*df.shape[0]), name='Threshold_Likelihood'), secondary_y=True)
fig.add_trace(go.Scatter(x=df.index, y=np.array([THRESHOLD_RAW_SCORE]*df.shape[0]), name='Threshold_Raw_score'), secondary_y=True)


fig.update_layout(autosize=False, width=1000, height=500)
fig.update_yaxes(
    title_text = "Sinus",
    title_standoff = 25,
    secondary_y=False)

fig.update_yaxes(
    title_text = "Anomaly score",
    title_standoff = 25,
    secondary_y=True)

for i in [[3700,3800], [4100,4150], [6000,6100], [6800,7000]]:
    fig.add_vrect(
        x0=i[0], x1=i[1],
        fillcolor="LightSalmon", opacity=0.5,
        layer="below", line_width=0,
    )

fig.show()

It is doing much better, as you can see in this picture.

However, there is still one anomaly which is not detected. This anomaly is at point 6000, where the sine is already low and the anomaly is a flat line at 0. Looking at it more closely, the prediction adjusts very quickly (too quickly in my opinion).

Is there any tip or trick that might fix this problem?

dmac · September 24, 2021, 11:56am

The plot looks much better!

I don’t think the date-encoder will help you because your data does not occur at any specific time of day or day of week. That encoder is useful for working with events that occur at specific times/dates.

I think that these two things (the end of the good data and the start of the anomaly) are too close together for the encoder to show any difference. Those inputs are close enough together that the encoder outputs either the same or very similar encodings. Try increasing the resolution of the encoder: you can do this by just increasing the total size of the encoder. I doubled it and now it detects that jump as an anomaly.
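
To make the resolution argument concrete, here is a toy scalar encoder in plain Python. It is not the htm.core ScalarEncoder; it assumes the buckets are spread evenly over [minimum, maximum] when a total size is given, which is my understanding of the non-periodic case. The input values 0.0 and 0.01 are hypothetical and only stand for two inputs that lie close together.

```python
def encode(value, size, active_bits=24, minimum=-1.0, maximum=3.0):
    """Toy non-periodic scalar encoder: a run of `active_bits` consecutive bits.

    Assumption: buckets are spread evenly over [minimum, maximum].
    """
    n_buckets = size - (active_bits - 1)
    resolution = (maximum - minimum) / (n_buckets - 1)
    start = int(round((value - minimum) / resolution))
    return set(range(start, start + active_bits))

def overlap(a, b, size):
    """Number of active bits shared by the encodings of a and b."""
    return len(encode(a, size) & encode(b, size))

for size in (4096, 8192):
    print(size, overlap(0.0, 0.01, size))  # 4096 -> 14 shared bits, 8192 -> 4
```

Doubling the size halves the value range covered by each bit, so the same small difference in input shares far fewer active bits.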

Yes, the prediction adjusts very quickly, because you have the following two TM parameters:

    'initialPerm': 0.21,
    'connectedPermanence': 0.13999999999999999,

Those parameters control how much time it takes for a new synapse to go from unconnected to connected. With your parameters the synapses start connected. So when the TM encounters the anomaly: it makes new synapses to predict the anomaly, and those synapses are immediately connected, so it immediately recognizes the anomaly after a single timestep.

If you changed your code to make synapses start in a disconnected state, then it would take longer to adjust to the anomaly, but it would also take longer to learn in general. There are also several other tuning parameters that control how synapses grow, die, and are strengthened/weakened by stimulus.
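
As a rough illustration of the relationship between these parameters, the sketch below counts how many permanence increments a freshly created synapse needs before it counts as connected. It ignores permanence decrements, the function name is made up, and the value 0.50 is a hypothetical alternative rather than a recommendation.

```python
import math

def reinforcements_until_connected(initial_perm, connected_perm, perm_inc):
    """Number of permanence increments before a new synapse counts as connected."""
    if initial_perm >= connected_perm:
        return 0  # the synapse is connected from the moment it is created
    return math.ceil((connected_perm - initial_perm) / perm_inc)

print(reinforcements_until_connected(0.21, 0.14, 0.1))  # 0 (values from the posted code)
print(reinforcements_until_connected(0.21, 0.50, 0.1))  # 3 (0.50 is a hypothetical value)
```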

I hope this helps!

Robbert_3 · September 24, 2021, 3:01pm

Thank you again David!

You have a valid point about the date. I changed the code in my last post so that the data is now correlated with the date: there are 30 sine periods in 8640 points, which corresponds to 30 days of 5-minute data. I also changed the size of the encoder as you suggested and increased connectedPermanence so that a new synapse is not instantly connected.
Now the results are indeed better (after a training phase).

After this, I changed y to y = np.sin(x) + 1 + (np.random.random_sample([8640,])/5) to add a little bit of noise, expecting a few extra anomalies, but now the screen is full of anomalies. Are there any parameters or approaches that make an HTM model look more at the temporal trend than at the change in the next step? The added noise is meant to represent a more real-world example than the perfect sine.

dmac · September 25, 2021, 12:38pm

I think that the problem with that noise is that its magnitude is too large. Earlier you made an anomaly which was a small change to the data signal, and the way to make the HTM detect it was to increase the sensitivity of the sensor/encoder. But now you have noise that is larger than that “anomaly”, and you want the HTM to ignore it.

The HTM can either ignore the noise in its inputs, or it can flag noisy input as anomalous. The parameters (especially the encoder and spatial-pooler parameters) control the sensitivity to noise. However, one HTM cannot both ignore an event and also flag it as an anomaly!

Some amount of tuning the parameters will (probably) get you an acceptable compromise between noise tolerance and anomaly detection.
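
A back-of-the-envelope comparison may help when tuning. The sketch below estimates how far two inputs must be apart before their encodings share no active bits, and compares that distance with the noise amplitude. It uses the encoder settings from the posted code (size 4096, 24 active bits), which may differ from the settings actually used for the noisy run, and it assumes evenly spaced buckets. The function name is made up for illustration.

```python
def zero_overlap_distance(size, active_bits, minimum, maximum):
    """Input difference at which two encodings stop sharing any active bits.

    Assumption: buckets are spread evenly over [minimum, maximum], so each
    bucket step shifts the run of active bits by one position.
    """
    resolution = (maximum - minimum) / (size - active_bits)
    return active_bits * resolution

noise_amplitude = 1.0 / 5.0  # upper bound of np.random.random_sample(...) / 5
distance = zero_overlap_distance(4096, 24, -1.0, 3.0)

print(round(distance, 4))                    # 0.0236
print(round(noise_amplitude / distance, 1))  # 8.5
```

Under these assumptions the noise can move the input several times further than the encoder tolerates, so consecutive noisy samples often look unrelated to the Spatial Pooler.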

I hope this helps.

Robbert_3 · September 27, 2021, 7:06am

Okay, it is understandable that the noise is quite large for a ‘non-anomaly’, but how can we influence the detection of trends? Some of the anomalies occur within a changing trend. It should still be possible to detect those, shouldn’t it?

I will try to change the parameters to be less sensitive to the noise! Thanks David!

jacobeverist · September 28, 2021, 3:50pm

It does detect the anomaly when it first occurs. It then learns the new normal and dutifully stops reporting anomalies. If you want to change this behavior, you need to turn off learning if you don’t want it to learn anomalous behaviors.

Reporting anomalies and learning the input patterns are tasks that are in conflict with each other. Everything is an anomaly until you learn it, and then it becomes normal behavior.

Normally it makes sense to train on normal data first, and then turn off learning while detecting for anomalies.
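
The conflict between learning and reporting can be shown with a deliberately simple stand-in. The class below is a first-order transition table over rounded values, not HTM and not part of htm.core. In the code posted earlier in this thread, the corresponding switch would be the learn flags passed to sp.compute(...) and tm.compute(...).

```python
import copy
import numpy as np

class ToyTransitionMemory:
    """First-order transition table over rounded values (a stand-in, not HTM)."""

    def __init__(self, resolution=0.1):
        self.resolution = resolution
        self.seen = set()   # learned (previous bucket, current bucket) pairs
        self.prev = None

    def compute(self, value, learn=True):
        bucket = int(round(value / self.resolution))
        transition = (self.prev, bucket)
        # A transition is anomalous (score 1.0) if it was never seen before.
        score = 0.0 if self.prev is None or transition in self.seen else 1.0
        if learn and self.prev is not None:
            self.seen.add(transition)
        self.prev = bucket
        return score

t = np.arange(200)
normal = np.sin(2 * np.pi * t / 20) + 1   # clean training data, 20 samples per period
flat = np.full(50, 1.0)                   # anomalous flat line

model = ToyTransitionMemory()
for value in normal:
    model.compute(value, learn=True)

still_learning = copy.deepcopy(model)
frozen = copy.deepcopy(model)
hits_learning = sum(still_learning.compute(v, learn=True) for v in flat)
hits_frozen = sum(frozen.compute(v, learn=False) for v in flat)
print(hits_learning, hits_frozen)  # 1.0 49.0
```

With learning left on, the flat line is reported once and then becomes the new normal. With learning frozen, every repeated flat step stays anomalous.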

Robbert_3 · September 29, 2021, 12:40pm

Thank you for your reply!

In this situation it does not detect the anomaly at first, since the flat line starts too close to the normal trend. But the flat line continues (still no anomaly detected), and this new trend, a flat line, is completely different from the normal sine (which has been learned). In this situation, reporting an anomaly is the desired behavior.

Normally this makes sense, but isn’t the power of an HTM model that it keeps learning continuously from a stream of data, or am I misunderstanding this?

jacobeverist · October 3, 2021, 4:45pm

Sure. This is true. But like I said:

HTM never claimed to solve these tasks simultaneously without conflict. In fact, I’m pretty sure no one has solved this yet.

Well, according to the image below, it definitely spikes the anomaly score so you can’t say it didn’t detect the transition. It’s just that your threshold is too high. Actually, it looks like it passes your threshold when it transitions back to normal. So you did detect an anomaly given your current configuration.

You will likely experience better results if you reduce your sample rate since the amount of change between steps is so small that it won’t flag an anomaly unless the change is massive. Without significant step-to-step change, the temporal memory won’t learn a lot of contextual sequence information without multi-step lookahead. However, doing that makes things very slow and explodes the complexity.

Try reducing sample rate by T/2, T/4, T/8 and see how the performance changes.
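
A minimal sketch of that experiment on the signal from the posted code, reading the suggestion as "keep every 2nd, 4th or 8th sample" (that reading is an assumption). It only measures how much the input changes from one step to the next; the downsampled series would still have to be fed through the encoder, SP and TM, and the anomaly window indices divided by the same factor.

```python
import numpy as np

x = np.linspace(0, np.pi * 60, 8640)   # same signal as in the posted code
y = np.sin(x) + 1

def mean_step_change(signal, keep_every):
    """Mean absolute change between consecutive samples after keeping every n-th sample."""
    return float(np.mean(np.abs(np.diff(signal[::keep_every]))))

for keep_every in (1, 2, 4, 8):
    print(keep_every, round(mean_step_change(y, keep_every), 4))
```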

CollinsEM · October 4, 2021, 10:43am

A simple test can tell you if the (relatively) lower anomaly score at the start of the flat interval was due to the small deviation from the learned sequence. Just shift the window to a different phase of the sine wave or change the constant value that it transitions into such that there is a larger jump in value. Check to see if the anomaly score is proportional to the magnitude of the transition. (This would also explain why the anomaly score was higher at the end of the window.)

My intuition for why you only see one brief spike in the anomaly score is that the sequence memory only has a memory of the most recent iteration. As such, it acts like a high-pass filter (i.e. detecting sudden changes in the signal but ignoring longer-term trends).

To remedy this (and to make it less sensitive to short bursts of sensor noise) you may need to extend the duration of the distal dendrites’ influence, perhaps by allowing the prediction signal to decay over several iterations rather than just disappear in a single iteration.

tuguohong · December 6, 2021, 10:45am

Hi Robbert_3, could you please paste your final code for my reference? I am searching for a solution for learning a semiconductor diffusion like this:

I want to detect temperature errors over the whole phase (ramp-up from 500 to 1200, hold for 20 hours, and then ramp-down to 500 degrees centigrade).

tuguohong · December 6, 2021, 10:46am

sorry, this picture:

Yoni_Cohen · December 8, 2021, 9:37am

Hi @Robbert_3, can you please post your final, properly working code here?
Thanks
