I’m new to HTM, and I want to use it for anomaly detection to support preventive maintenance decisions.

I think I have a basic understanding of the model, but I still can’t tell which technique I should use to choose the right variables to feed to the model.

How many variables are you considering? It’s generally advised not to have more than 4 or 5 in a single model, and it’s sometimes best to have a separate model for each one. There is a process for this called swarming, which is a guided search for optimal parameter values, including which variables contribute to better predictions. Here’s a link to the docs on it:

Actually, I don’t have the data yet, as I’m proposing HTM anomaly detection as a solution to an incident-prediction problem, but I’m assuming the number of variables will be much greater than 4 or 5.

But now I have two questions.

First, what if each variable on its own (in a single-variable model) wouldn’t trigger an anomaly in any of those models, but a multivariate model would?

Second, if I had a single model for each variable, how would I combine the results of the individual models into one anomaly score?

This is certainly possible. With separate models for each variable, each model would only tell you whether that one variable was acting strangely. It’s important to understand that when multiple variables are included in a single model, their encodings are concatenated together and fed into the system, so the anomalies produced by that model tell you that the collection of variables as a whole was acting strangely, but it isn’t possible to say precisely which variables or which combinations are the culprits. Ideally, having both single-variate and multi-variate models could offer the most insight: when the multivariate model flagged an anomaly, you could check the single-variable models to see which ones were contributing.

I don’t think there’s any rule of thumb for how to do this. Your intuition, as the one who understands what you’re modeling, would be the way to go. Most simply, you could average the different anomaly scores to get one overall anomaly metric.
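A minimal sketch of that simplest approach, averaging per-variable anomaly scores into one overall metric. The variable names and score values here are hypothetical; in practice each score would come from its own HTM model’s anomaly output for the current timestep.

```python
# Hypothetical per-variable anomaly scores, one per single-variable model.
scores = {
    "temperature": 0.12,
    "vibration": 0.85,
    "pressure": 0.20,
}

# Average them into a single overall anomaly metric.
overall = sum(scores.values()) / len(scores)
print(round(overall, 3))  # 0.39
```

A plain average treats every variable equally; if domain knowledge says some variables matter more, a weighted average (or taking the maximum, to avoid one strongly anomalous variable being diluted) would be a natural variation.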

for count, record in enumerate(reader):
    if count >= numRecords:
        break
    # Convert the date string into a Python datetime object.
    dateString = datetime.datetime.strptime(record[0], "%m/%d/%y %H:%M")
    # Convert the data value string into a float.
    consumption = float(record[1])
    # To encode, we need to provide zero-filled numpy arrays for the encoders
    # to populate.
    timeOfDayBits = numpy.zeros(timeOfDayEncoder.getWidth())
    weekendBits = numpy.zeros(weekendEncoder.getWidth())
    consumptionBits = numpy.zeros(scalarEncoder.getWidth())
    # Now we call the encoders to create bit representations for each value.
    timeOfDayEncoder.encodeIntoArray(dateString, timeOfDayBits)
    weekendEncoder.encodeIntoArray(dateString, weekendBits)
    scalarEncoder.encodeIntoArray(consumption, consumptionBits)
    # Concatenate all these encodings into one large encoding for Spatial
    # Pooling.
    encoding = numpy.concatenate(
        [timeOfDayBits, weekendBits, consumptionBits]
    )

The numpy.concatenate here combines the separate encodings from the 3 fields (timeOfDayBits, weekendBits, consumptionBits) into one input vector for Spatial Pooling.
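A toy illustration of that concatenation step, with small made-up bit arrays standing in for the real encoder outputs. The width of the combined encoding is just the sum of the individual encoder widths.

```python
import numpy as np

# Stand-ins for the three encoder outputs (real encoders produce much
# wider arrays via getWidth() / encodeIntoArray()).
timeOfDayBits = np.array([1, 0, 0, 1])
weekendBits = np.array([0, 1])
consumptionBits = np.array([1, 1, 0])

# One flat input vector; order matters and must stay consistent.
encoding = np.concatenate([timeOfDayBits, weekendBits, consumptionBits])
print(len(encoding))  # 9
```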

I have a related question. Say one incorporates three variables, as in the Quickstart example. Does the number of bits a variable occupies in the input space have a bearing on its “weighting” in the prediction? I presume it does, but I just want to make sure.

e.g. timeOfDayBits has 300 bits, weekendBits 50, consumptionBits 400, for an input space of 750 bits.

Does weekendBits have a relatively minimal impact on the next predicted value?

It seems to me that the proportion of input bits allotted to each encoded variable is a very important parameter to monitor.
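If it helps, the proportions in the example above are quick to compute; this sketch just turns the assumed widths (300 + 50 + 400 = 750 bits) into percentage shares of the input space.

```python
# Hypothetical encoder widths from the example above.
widths = {"timeOfDayBits": 300, "weekendBits": 50, "consumptionBits": 400}
total = sum(widths.values())

# Each variable's share of the concatenated input space.
for name, w in widths.items():
    print(f"{name}: {w / total:.1%}")
```

On these numbers, weekendBits occupies under 7% of the input space, which is the sort of imbalance the question is getting at.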

I’ll investigate alternatives, though I’d rather not increase the size. I wonder whether it’s possible to use the saccade vector as a means to shift or hash the feature space so that it can represent the change in direction and magnitude while still retaining the same semantic meaning of the features… hmm, interesting.

Stay tuned for papers coming out in October that talk about how movement vectors can get into the system as proximal input to a “location layer”. This approach is more complicated but more biologically plausible than just shoving movement data and sensory data into the same bit arrays.