How to encode IDS alert logs into an SDR?

Hi All,
I’m new to HTM and NuPIC. I’m trying to encode a group of IDS alerts into a single SDR. Each group (at most 6 alerts) describes one network intrusion action, so I want to encode the intrusion action as an SDR. Each alert in the action is denoted as <timestamp, attack type, sourceIP, sourcePort, destIP, destPort>. I also want to discover the correlations between intrusion actions. Any conceptual suggestions would be appreciated.

Thanks and Regards,
Zark.

We have datetime encoding strategies already pre-defined. The others could be categories, but we’d need to learn more about what they represent. Have you seen this? It shows how to record numbers in different ways. You can combine encodings together to combine semantics in an input space.
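As a rough illustration of combining encodings, here is a minimal pure-Python sketch. It does not use the NuPIC encoder classes; the helper names (`encode_scalar`, `encode_category`, `encode_ip`, `encode_alert`), the bit widths, and the choice to encode an IP address one octet at a time are all assumptions made for the example. The timestamp is left out because a datetime encoder would handle it separately.

```python
def encode_scalar(value, min_val, max_val, n=64, w=9):
    """Return n bits with a contiguous run of w active bits.

    The position of the run tracks the value, so nearby values share bits.
    """
    value = max(min_val, min(max_val, value))  # clip to the valid range
    start = int(round((value - min_val) / (max_val - min_val) * (n - w)))
    return [1 if start <= i < start + w else 0 for i in range(n)]


def encode_category(index, num_categories, w=5):
    """Return one dedicated block of w bits per category (no overlap)."""
    bits = [0] * (num_categories * w)
    for i in range(index * w, (index + 1) * w):
        bits[i] = 1
    return bits


def encode_ip(ip):
    """Encode a dotted IPv4 address as four scalar-encoded octets (an assumption)."""
    bits = []
    for octet in ip.split("."):
        bits += encode_scalar(int(octet), 0, 255)
    return bits


def encode_alert(type_id, num_types, src_ip, src_port, dst_ip, dst_port):
    """Concatenate the per-field encodings into one input vector."""
    return (
        encode_category(type_id, num_types)
        + encode_ip(src_ip)
        + encode_scalar(src_port, 0, 65535, n=128)
        + encode_ip(dst_ip)
        + encode_scalar(dst_port, 0, 65535, n=128)
    )


# Hypothetical alert: attack type 3 out of 10 known types.
sdr = encode_alert(3, 10, "192.168.20.3", 44321, "10.0.0.5", 80)
print(len(sdr), sum(sdr))
```

Each field keeps its own region of the combined vector, so the semantics of one field never collide with another. Whether a port or an octet really deserves a scalar encoding depends on what "similar" should mean for your data.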

timestamp: I can use the DateTime encoder.
attack type: it is a string, but I can convert it to its corresponding type ID, which is an integer, and use the scalar encoder.
sourceIP or destIP: an IP address like “192.168.20.3”.
sourcePort or destPort: a port number in the range 0-65535.

Should we first translate each type of data into numbers and then use the scalar encoder to encode them? Is that the recommended way?

I have a second question:

I want to discover the correlations between different attack actions in order to predict the next few actions the attacker may take. It seems HTM can predict only one time step ahead. Is there any way around this?

If I asked you to recite the English alphabet, one char at a time, you could. Easily.

If I said do it only speaking every fourth character - it would be difficult.

A, E, I, M, Q, U, Y.

You might, under your breath, have been saying “bcd, fgh, jkl, nop, rst, vwx, z”.

With practice - with more samples - the gaps between each “step” could change scales.

HTM has tunable parameters to play with “gap size”.

I am guessing this is what you were thinking of when you said “single step” ?

There’ll be examples of that on the Building HTM example page.

Have you seen AIOps from grokstream.com? They implement a version of HTM for IT-related purposes.

Thanks a lot for the quick replies.
They give me great inspiration and a nice reference.

I have a third question:
I am not very familiar with Temporal Memory. Is it suitable for a problem like this: exporting learned patterns to files so that we can persist them and reload them later to recognize different patterns in real time?
In the network security field, there are many different attack scenarios to learn, and we need to persist the learned attack patterns. Keeping all the patterns in the higher level of HTM at run time is a problem.

If you encode your input data properly with semantics, the TM can give you live indications of how anomalous the current state is. It is harder to classify the sequence, but you can do it if you have lots of labeled examples of states and you train a model on each state (the model with the lowest anomaly is the classification). The predictions from the TM are not usually accurate unless you have a very stable pattern. Judging from your description of your data, I doubt you will be successful with predicting future patterns. If you can focus on anomalies you’ll have the most success.
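To make the "one model per state, lowest anomaly wins" idea concrete, here is a small sketch. It is not an HTM Temporal Memory: `TransitionModel` is a hypothetical stand-in that only remembers which element followed which, and the labels and attack-type names are made up for illustration. The classification step at the end is the part that carries over.

```python
from collections import defaultdict


class TransitionModel:
    """Toy stand-in for a trained TM: remembers first-order transitions."""

    def __init__(self):
        self.next_seen = defaultdict(set)

    def learn(self, sequence):
        for prev, cur in zip(sequence, sequence[1:]):
            self.next_seen[prev].add(cur)

    def mean_anomaly(self, sequence):
        """Fraction of transitions in the sequence the model never saw."""
        steps = list(zip(sequence, sequence[1:]))
        if not steps:
            return 0.0
        misses = sum(
            1 for prev, cur in steps if cur not in self.next_seen.get(prev, ())
        )
        return misses / len(steps)


def classify(models, sequence):
    """Pick the label whose model is least surprised by the sequence."""
    return min(models, key=lambda label: models[label].mean_anomaly(sequence))


# Hypothetical labeled training sequences of attack types.
training = {
    "scan_exploit": [["scan", "scan", "exploit", "exfil"]],
    "brute_force": [["login_fail", "login_fail", "login_ok", "priv_esc"]],
}

models = {}
for label, sequences in training.items():
    models[label] = TransitionModel()
    for seq in sequences:
        models[label].learn(seq)

print(classify(models, ["scan", "exploit", "exfil"]))
```

With real HTM models you would feed the encoded alerts to each trained model and compare their anomaly scores over the sequence instead of counting unseen transitions.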

Speaking of semantics, as you decide how to encode your data, you should definitely read Encoding Data for HTM Systems. Some of the fields you mention could be categories, but should they have overlap? That’s something you need to decide.
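The overlap question can be checked directly by counting shared active bits. The sketch below is plain Python, not the NuPIC encoders, and the helper names and sizes are assumptions. It compares two attack type IDs, 3 and 4, encoded first as scalars and then as categories.

```python
def scalar_bits(value, min_val, max_val, n, w):
    """Active bit indices for a scalar: a run of w bits that slides with the value."""
    value = max(min_val, min(max_val, value))
    start = int(round((value - min_val) / (max_val - min_val) * (n - w)))
    return set(range(start, start + w))


def category_bits(index, w):
    """Active bit indices for a category: a dedicated block of w bits."""
    return set(range(index * w, (index + 1) * w))


def overlap(a, b):
    """Number of active bits two encodings share."""
    return len(a & b)


# Attack type IDs 3 and 4 treated as scalars in the range 0..20.
as_scalar = overlap(scalar_bits(3, 0, 20, n=50, w=9),
                    scalar_bits(4, 0, 20, n=50, w=9))

# The same IDs treated as unrelated categories.
as_category = overlap(category_bits(3, w=9), category_bits(4, w=9))

print(as_scalar, as_category)  # 7 0
```

The scalar version tells the system that type 3 and type 4 are nearly the same thing, which is only right if the IDs were assigned so that neighbouring numbers mean related attacks. If the IDs are arbitrary, the category version is the safer choice.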