Encoding a sequence or word

I want to encode a sequence and use HTM to detect anomalies in it.
For example, if the sequence is ABCD, the model should detect an anomaly when I input ABCF. How can I implement the encoding for ABCD and ABCF, and how can I convert them to SDRs?

In addition, how can I encode a word, that is, convert a word to an SDR?


Are you sure that’s what you need? I’m pretty sure you can just pass in A,B,C,D at different steps and the model will interpret the sequence appropriately.

Use the category encoder: http://nupic.docs.numenta.org/stable/api/algorithms/encoders.html#category-encoders
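To make the idea concrete, here is a minimal pure-Python sketch of what a category encoder does conceptually: every known category gets its own non-overlapping block of `w` active bits, and one extra block is reserved for unknown values. This is an illustration under those assumptions, not the NuPIC implementation; the function name `encode_category` is made up for this example. A word can be handled the same way by treating each distinct word as one category.

```python
def encode_category(value, categories, w=3):
    """Toy category encoding: one block of w active bits per category.

    Block 0 is reserved for values that are not in `categories`
    (an assumption of this sketch, used here for unknown inputs).
    """
    total_bits = (len(categories) + 1) * w
    bits = [0] * total_bits
    # Known categories map to blocks 1..len(categories); unknown maps to block 0.
    block = categories.index(value) + 1 if value in categories else 0
    for i in range(block * w, block * w + w):
        bits[i] = 1
    return bits


categories = ["A", "B", "C", "D", "E"]

# Feed one symbol per time step; each step gets its own sparse bit array.
sdr_a = encode_category("A", categories)
sdr_b = encode_category("B", categories)
sdr_f = encode_category("F", categories)  # "F" is not in the list

# Different categories share no active bits.
overlap_ab = sum(a & b for a, b in zip(sdr_a, sdr_b))
```

Because the blocks do not overlap, the model sees A, B, C, D, and E as unrelated symbols, which is usually what you want for categorical data.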


Please see our Quick Start docs. Once you get a sample application running, you can start trying to change the encoder to a CategoryEncoder. Here is another example of using a category encoder.

Thank you. I used the CategoryEncoder to predict the sequence yesterday and got results, but they are not ideal. See below (columns: timestamp, actual value, predicted value):
|2017/11/8 0:59|A|A|
|2017/11/8 1:59|B|B|
|2017/11/8 2:59|C|C|
|2017/11/8 3:59|D|D|
|2017/11/8 4:59|E|E|
|2017/11/8 5:59|A|A|
|2017/11/8 6:59|B|B|
|2017/11/8 7:59|C|C|
|2017/11/8 8:59|D|D|
|2017/11/8 9:59|E|E|
|2017/11/8 10:59|A|B|
|2017/11/8 11:59|B|C|
|2017/11/8 12:59|C|D|
|2017/11/8 13:59|D|E|
|2017/11/8 14:59|E|A|
|2017/11/8 15:59|A|B|
|2017/11/8 16:59|B|C|
|2017/11/8 17:59|C|D|
|2017/11/8 18:59|D|E|
|2017/11/8 19:59|E|A|
|2017/11/8 20:59|E|A|
|2017/11/8 21:59|A|B|
|2017/11/8 22:59|B|C|
|2017/11/8 23:59|C|D|
|2017/11/9 0:59|D|E|
|2017/11/9 1:59|E|A|
|2017/11/9 2:59|A|B|
|2017/11/9 3:59|B|C|
|2017/11/9 4:59|C|D|
|2017/11/9 5:59|D|E|
|2017/11/9 6:59|E|A|
|2017/11/9 7:59|A|B|
|2017/11/9 8:59|B|C|
|2017/11/9 9:59|F|A|
|2017/11/9 10:59|C|D|
|2017/11/9 11:59|D|E|
|2017/11/9 12:59|E|A|
|2017/11/9 13:59|A|B|
|2017/11/9 14:59|B|F|
There are two anomalies in the data above. The correct sequence is A-B-C-D-E, but when I input A-B-F-C-D-E-A-B, the predicted sequence is A-B-C-A-D-E. I think the predicted A after C is an error. What can I do to get rid of the anomalous category F so that the prediction continues normally as A-B-C-D-E?

How many rows of data have you fed the model by the time it sees this data and makes these predictions?

When I align the predictions with the data, it doesn’t look like it is performing that poorly, especially if this is all the data it has seen so far:

Thank you. We have thousands of rows of data, and the overall results are still good. But I have a question: when I encounter the anomaly F in my A-B-C-D-E sequence, is there any operation that eliminates it? In other words, F is noise. What should I do to prevent the prediction from being affected by noise?

Just based on the actual/predicted values you posted, it looks like the learning rate may be higher than you want. With your current settings, presumably entering the correct sequence A-B-C-D-E again a couple of times should get it predicting C after A-B again. If you lower the permanence increment and decrement values, then noise/errors will have less impact on the predictions (at the cost of more iterations to learn new sequences).
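The trade-off can be sketched with a toy model of a single synapse. This is only an illustration, not NuPIC code: the function names are hypothetical, permanence is stored as an integer in thousandths to avoid floating-point edge cases, and the synapse is assumed to count as connected while its permanence is at or above the threshold.

```python
def updates_until_disconnected(permanence, decrement, threshold):
    """Count wrong-prediction updates needed to push a connected synapse below threshold."""
    count = 0
    while permanence >= threshold:
        permanence -= decrement
        count += 1
    return count


def updates_until_connected(permanence, increment, threshold):
    """Count reinforcing updates needed for a new synapse to reach threshold."""
    count = 0
    while permanence < threshold:
        permanence += increment
        count += 1
    return count


THRESHOLD = 500  # 0.5 expressed in thousandths (illustrative value)

# Large steps: a single noisy input is enough to break a learned connection.
noise_tolerance_fast = updates_until_disconnected(600, 150, THRESHOLD)
# Small steps: the same connection survives several noisy inputs.
noise_tolerance_slow = updates_until_disconnected(600, 20, THRESHOLD)

# The price: a new synapse takes longer to become connected.
learning_time_fast = updates_until_connected(210, 100, THRESHOLD)
learning_time_slow = updates_until_connected(210, 20, THRESHOLD)
```

With the smaller step the connection tolerates more noise, but a new sequence needs several times as many repetitions before it is learned.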

That said, when F is encountered the first time as in the data above, it isn't going to predict C, because C never followed A-B-F (so it never learned a sequence like A-B-F-C-D-E). However, if you were to input F again after the last row you listed, it would probably predict C (because this time it would remember the previous variation).
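A toy context table shows the same behavior. It is a deliberately simplified stand-in for sequence memory, not how HTM works internally: the class `ToySequenceMemory` and its fixed context length of three symbols are assumptions made for this sketch.

```python
from collections import Counter, defaultdict


class ToySequenceMemory:
    """Remembers which symbol followed each run of `context_length` symbols."""

    def __init__(self, context_length=3):
        self.context_length = context_length
        self.transitions = defaultdict(Counter)
        self.history = []

    def observe(self, symbol):
        # Learn: record that `symbol` followed the current context.
        context = tuple(self.history[-self.context_length:])
        if len(context) == self.context_length:
            self.transitions[context][symbol] += 1
        self.history.append(symbol)

    def predict(self):
        # Return the most frequent follower of the current context, if any.
        context = tuple(self.history[-self.context_length:])
        counts = self.transitions.get(context)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


memory = ToySequenceMemory()
for symbol in "ABCDE" * 4:  # learn the normal sequence
    memory.observe(symbol)

for symbol in "ABF":  # first time F appears after A-B
    memory.observe(symbol)
first_prediction = memory.predict()  # context A-B-F was never seen before

for symbol in "CDE" + "ABF":  # the sequence continues, then F appears again
    memory.observe(symbol)
second_prediction = memory.predict()  # A-B-F was followed by C once
```

Here `first_prediction` is `None` because nothing has ever followed A-B-F, while `second_prediction` is `"C"` because the earlier variation was remembered.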


Thanks for the feedback. I will try the method you suggested.