Feedback on network setup for anomaly detection

anomaly-detection
network

#1

Hi,
I’m trying to tune my network for anomaly detection.
I’d like feedback on the parameters I set, to find out whether I made any mistakes.
Here is an excerpt of my dataset, with the anomaly score computed by the network described below:

{read=-1.6133}  1.0
{read=-1.4906}  1.0 
{read=-1.4457}  1.0  
{read=-1.614}   1.0  
{read=-1.4786}  1.0  
{read=-1.5172}  1.0  
{read=-1.5228}  1.0  
{read=-1.5103}  1.0  
{read=-1.5214}  1.0  
{read=-1.4453}  1.0  
{read=-1.5975}  1.0  
{read=-1.5169}  1.0  
{read=-1.5388}  1.0  
{read=-1.5082}  0.825  
{read=-1.5004}  0.575 
{read=-1.5275}  0.575  
{read=-1.5414}  0.0  
{read=-1.4983}  0.0  
{read=-1.5007}  0.0 
{read=-1.5892}  0.725  
{read=-1.4596}  1.0  
{read=-1.5368}  0.0  
{read=-1.5171}  0.0  
{read=-1.4069}  0.575  
{read=-1.674}   1.0  
{read=-1.3818}  1.0  
{read=-1.4626}  0.575  
{read=-1.5901}  1.0  
{read=-1.3821}  1.0  
{read=-1.6325}  0.725  
{read=-1.4817}  1.0  
{read=-1.5102}  0.0  
{read=-1.6279}  0.725  
{read=-1.4113}  1.0  
{read=-1.5805}  0.725  
{read=-1.5106}  1.0  
{read=-1.5508}  0.0  
{read=-1.5027}  0.0  
{read=-1.4936}  0.0  
{read=-1.5706}  0.725  
{read=-1.5172}  0.575  
{read=-1.5215}  0.0  
{read=-1.4843}  0.0  
{read=-1.6138}  0.725 
{read=-1.4353}  0.575  
{read=-1.5029}  1.0  
{read=-1.5222}  0.0 
{read=-1.4888}  0.0  
{read=-1.614}   0.725  
{read=-1.4927}  0.575  
{read=-1.5226}  0.0  
{read=-1.5221}  0.075  
{read=-1.5485}  0.0  
{read=-1.503}   0.0  
{read=-1.5126}  0.0  
{read=-1.4889}  0.0  
{read=-1.5655}  0.725  
{read=-1.4901}  0.575  
{read=-1.4312}  0.575  
{read=-1.6286}  1.0  
{read=-1.4448}  0.575  
{read=-1.5415}  1.0  
{read=-1.4528}  0.575  
{read=-1.5515}  1.0  
{read=-1.6427}  0.725  
{read=-1.4883}  0.575  
{read=-1.5322}  0.0  
{read=-1.5104}  0.0  
{read=-1.5559}  0.0  
{read=-1.4451}  0.575  
{read=-1.5222}  1.0  
{read=-1.5423}  0.0  
{read=-1.5553}  0.0  
{read=-1.5364}  0.0  
{read=-1.504}   0.0  
{read=-1.6713}  0.8  
{read=-1.3651}  0.575  
{read=-1.5949}  1.0  
{read=-1.5317}  0.0  
{read=-1.4043}  0.575  
{read=-1.691}   1.0  
{read=-1.3486}  0.85  
{read=-1.5586}  1.0  
{read=-1.6381}  0.725  
{read=-1.4351}  0.575  
{read=-1.5947}  1.0  
{read=-1.5174}  0.0  
{read=-1.4695}  0.0  
{read=-1.6138}  0.725  
{read=-1.4445}  0.575  
{read=-1.5078}  0.875  
{read=-1.5951}  0.55  
{read=-1.4976}  0.575  
{read=-1.5485}  0.0  
{read=-1.5369}  0.0  
{read=-1.5396}  0.0  
{read=-1.5561}  0.0  
{read=-1.522}   0.0  
{read=-1.4624}  0.575  
{read=-1.5562}  1.0

These are sensor readings, taken every 20 ms.
The parameters are:

n = 64
w = 3
min = 0
max = 0
radius = 0
resolution = 0.1
periodic = FALSE
clip = null
forced = null
fieldType = float
encoderType = RandomDistributedScalarEncoder

GLOBAL_INHIBITION: true
COLUMN_DIMENSIONS: 2048
CELLS_PER_COLUMN: 32
NUM_ACTIVE_COLUMNS_PER_INH_AREA: 40
POTENTIAL_PCT: 0.8
SYN_PERM_CONNECTED: 0.2
SYN_PERM_ACTIVE_INC: 0.003
SYN_PERM_INACTIVE_DEC: 0.0005
MAX_BOOST: 1
MAX_NEW_SYNAPSE_COUNT: 20
INITIAL_PERMANENCE: 0.24
PERMANENCE_INCREMENT: 0.04
PERMANENCE_DECREMENT: 0.008
MIN_THRESHOLD: 13
ACTIVATION_THRESHOLD: 20
MAX_NEW_SYNAPSE_COUNT: 128
PREDICTED_SEGMENT_DECREMENT: 0.001
CLIP_INPUT: false

Honestly, I set only some of these parameters myself, because I don’t have enough experience or knowledge to tune all of them; most come from network examples or advice in the documentation.
Can I get feedback on the network setup and on whether, in your opinion, the anomaly scores are plausible?


#2

I tried to play a bit with the parameters, especially resolution, n and w. My goal is also to obtain predicted values together with anomaly scores.

According to an HTM forum post, n should be 300-400 and w around 21.
As for resolution, given my data a better choice would probably be 0.01 instead of 0.1.

Using those values probably fits my dataset better, and I obtain higher anomaly scores. I also noticed that n = 400 makes the algorithm a bit slow. What are the criteria for choosing n and w?
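
To get a rough feel for what the resolution change means for this data, the sketch below counts how many resolution-wide buckets the readings in the excerpt span. It is an approximation, not the encoder's own bucket assignment: it assumes each bucket covers one interval of width resolution, and the names BucketEstimate and bucketsSpanned are made up for illustration. The min and max values are the smallest and largest readings in the excerpt from post #1.

```java
public class BucketEstimate {

    /**
     * Rough number of resolution-wide buckets touched by values in [min, max].
     * Assumption: bucket boundaries sit at integer multiples of the resolution.
     */
    static long bucketsSpanned(double min, double max, double resolution) {
        long lowest = (long) Math.floor(min / resolution);
        long highest = (long) Math.floor(max / resolution);
        return highest - lowest + 1;
    }

    public static void main(String[] args) {
        double min = -1.691;   // smallest reading in the excerpt
        double max = -1.3486;  // largest reading in the excerpt
        for (double resolution : new double[] {0.1, 0.01}) {
            System.out.println("resolution " + resolution + " spans "
                    + bucketsSpanned(min, max, resolution) + " buckets");
        }
    }
}
```

For the excerpt's extremes this gives 4 buckets at resolution 0.1 and 36 at 0.01, so at 0.1 most readings land in the same few buckets.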

To obtain predicted values as well, I also printed:
getClassification("read").getMostProbableValue(1)
but I get very strange values: sometimes null, for example, and sometimes always the same fixed value.
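
Whatever the cause of the null values, code that consumes the predictions can guard against them. The following is a minimal sketch, assuming the predicted value arrives as a possibly-null Object. PredictionGuard is a hypothetical helper, not part of HTM.Java; it simply carries the last non-null numeric prediction forward.

```java
import java.util.Optional;

public class PredictionGuard {

    // Last non-null numeric prediction seen so far (null until one arrives).
    private Double lastSeen = null;

    /**
     * Accepts a raw prediction that may be null.
     * Returns the new value if it is numeric, otherwise the last known one;
     * the result is empty if no numeric prediction has been seen yet.
     */
    Optional<Double> accept(Object raw) {
        if (raw instanceof Number) {
            lastSeen = ((Number) raw).doubleValue();
        }
        return Optional.ofNullable(lastSeen);
    }

    public static void main(String[] args) {
        PredictionGuard guard = new PredictionGuard();
        Object[] rawPredictions = {null, -1.5, null}; // made-up sample values
        for (Object raw : rawPredictions) {
            System.out.println(guard.accept(raw));
        }
    }
}
```

This only hides the gaps from downstream code; it does not explain why the classifier returned null in the first place.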


#3

We talked about a lot of your questions in the latest Hackers’ Hangout:

Also, your model params look out of date (the MAX_BOOST param is now BOOST_STRENGTH). Use these model params as a starting point.


#4

I’m watching the video. Is there an explanation for the null prediction values? I changed the parameters to those described in your link, but a lot of nulls still appear.


#5

I’m not sure you’re getting predictions correctly. Have you walked through the Network API Quickstart?


#6

I’m essentially using HTM.Java, so I’m not following that document for the implementation.
I searched the forum and found this htm.java post, which suggests that the only way to get predictions in Java is the getMostProbableValue method.


#7

Moved from #nupic into #htm-java.


#8

Hi @Andrea_Giordano,

For HTM.Java, disregard the parameter reference @rhyolight linked you to, and just use the parameters that are shown in the Parameters.java file. Refer to this Quick Start Guide instead.

To include prediction, make sure you have these set in your Parameters (where the Parameters variable is named "p"):

  1. p.set(KEY.AUTO_CLASSIFY, true);
  2. p.set(KEY.INFERRED_FIELDS, getInferredFieldsMap("consumption", CLAClassifier.class));
    As of the newest version, you can specify which fields you want to “predict” and which Classifier to use. Line #2 above refers to this feature.

The getInferredFieldsMap() method is a static convenience method on the NetworkTestHarness class.


#9

Thank you cogmission,
I reset the parameters based on the Parameters.java file, but now I get 1.0 as the anomaly score almost every time :confused:

As for prediction, I had already set the parameters you mentioned, but getMostProbableValue(1) still sometimes returns null.


#10

Can you output Connections.printParameters() from your Network? Make sure potentialRadius is set to your input width if you’re using global inhibition.

Otherwise, at this point, I would say that it is a function of your parameters. You’ll have to experiment with those…


#11

Global inhibition was false, but potentialRadius was equal to w anyway.
As I said in another thread, I’m investigating other parameters I found in the updated flink-htm I’m using.

Can you just confirm that the correct method to call to get predicted values is:

getClassification("field").getMostProbableValue(1)?


#12

Yes, that’s definitely how it is retrieved.

You can verify this in a number of places in the tests, such as here.

By the way, you want global inhibition to be true for most cases (and it is much faster).

Cheers,
David


#13

I’m experimenting with the parameters, but I believe the anomaly scores I get are far too high.
I submitted my dataset to HTM Studio and got coherent results. I see that HTM Studio automatically finds the best params for the data. Is there a way to do that in HTM.java, or to get the parameters used by HTM Studio?