Finding the predictability of a pattern of anomalies with HTM

artifex · April 8, 2019, 10:35am

Hi,

I’m trying to use NuPIC to detect possible cyber attacks. The basic idea is to collect many metrics, each measuring some “feature” of how users, computers and networks behave within the organization. Each metric is sent both to an HTM model and to other algorithms (mostly statistical) that detect anomalies in the behaviour of a single metric. Each of these components reports the anomalies it detects to another service, which looks at a specific time window (let’s say a day) and sends a second HTM model an array indicating which metrics behaved anomalously during that window, with what magnitude, and which metrics showed no anomalous behaviour. This second model should tell whether the day’s pattern of metric anomalies is itself anomalous and unpredictable, or whether it is actually predictable. The catch is that the number of such metrics can be quite large: a few thousand for a small organization and potentially tens or hundreds of thousands for a medium one.

I have already built the first part, which handles anomaly detection in single metrics (similar to the gym example), and it works perfectly. As for the second HTM part, I think I’ll have to define a map indicating which metrics are, or can be considered, “similar” to which others. I would then write an encoder that maintains overlap between similar metrics, prevents overlap between metrics that are not “similar”, and represents the whole thing as an SDR.

I would very much like your feedback on this approach. If the approach is sound, an example of a similar encoder in Python would be very helpful and much appreciated.

Thanks a lot
Yuval.

rhyolight · April 8, 2019, 12:22pm

I think it could work, but be sure to include time somehow in the 2nd stage encoding for the anomalies. If you do this in day batches, day of week may be important.
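As a rough illustration of the day-of-week idea, here is a minimal sketch of a cyclic encoder in plain Python. It is not the NuPIC date encoder; the function name `encode_day_of_week` and the parameters `width` and `bits_per_day` are hypothetical and chosen only for this example. Adjacent days share active bits, and Sunday wraps around to overlap with Monday.

```python
import datetime


def encode_day_of_week(date, width=3, bits_per_day=2):
    """Encode the weekday of `date` as a list of 0/1 values.

    A contiguous block of `width` bits is switched on, starting at
    weekday * bits_per_day and wrapping around the end of the array,
    so neighbouring days overlap and distant days do not.
    """
    size = 7 * bits_per_day
    start = date.weekday() * bits_per_day  # Monday is 0, Sunday is 6
    sdr = [0] * size
    for i in range(width):
        sdr[(start + i) % size] = 1
    return sdr


def overlap(a, b):
    """Count the positions that are active in both encodings."""
    return sum(x & y for x, y in zip(a, b))


if __name__ == "__main__":
    monday = encode_day_of_week(datetime.date(2019, 4, 8))
    tuesday = encode_day_of_week(datetime.date(2019, 4, 9))
    print(monday, tuesday, overlap(monday, tuesday))
```

The output of a function like this could be concatenated with the anomaly-pattern bits before the combined array is fed to the second-stage model.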

artifex · April 8, 2019, 1:50pm

Hi, thanks a lot for the quick reply. I forgot to mention it, but time was certainly meant to be included. In fact, I intended to include the current date on two different calendars (e.g. the Gregorian date and the Jewish date). Do you have a good starting point for building an encoder that can encode this data (the current Gregorian date, the current Jewish date and the anomaly pattern for the day)?

Also, since the number of metrics (i.e. the number of possible anomalies) can be quite high (several tens of thousands), how big should the SDRs created by this encoder be?

Thanks,
Yuval.

rhyolight · April 8, 2019, 5:34pm

This type of custom encoder would not be too hard to build. Here is an example from NuPIC. Also, I will be building out a date encoder soon for Building HTM Systems on my Twitch stream, starting Thursday. I still have some work to do converting the project into a proper web application before I get to that, but the date encoder will be the first thing I need to build to continue the project.
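To make the overlap requirement from the original question concrete, here is a minimal sketch in plain Python. It is not the NuPIC example mentioned above and does not use the NuPIC encoder API. It assumes a hypothetical `groups` map from metric name to a "similarity group", and it hashes names into a fixed-size bit space so the output size does not grow with the number of metrics. All names and default sizes (`encode_anomaly_pattern`, `size`, `shared_bits`, `own_bits`) are illustrative assumptions, not recommendations.

```python
import hashlib


def _hashed_bits(key, count, size):
    """Deterministically map a string key to `count` distinct indices in range(size)."""
    bits = []
    counter = 0
    while len(bits) < count:
        digest = hashlib.sha256(f"{key}:{counter}".encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % size
        if index not in bits:
            bits.append(index)
        counter += 1
    return bits


def encode_anomaly_pattern(anomalies, groups, size=4096, shared_bits=10, own_bits=10):
    """Return the set of active bit indices for one time window.

    anomalies: dict of metric name -> anomaly magnitude in (0, 1];
               metrics with no anomaly are simply left out.
    groups:    dict of metric name -> similarity group name (assumed map).
    """
    active = set()
    for metric, magnitude in anomalies.items():
        if magnitude <= 0:
            continue
        # Bits shared by every metric in the same group give "similar" metrics overlap.
        active.update(_hashed_bits("group:" + groups[metric], shared_bits, size))
        # Bits specific to this metric; a larger magnitude activates more of them.
        n_own = max(1, round(magnitude * own_bits))
        active.update(_hashed_bits("metric:" + metric, own_bits, size)[:n_own])
    return active


if __name__ == "__main__":
    groups = {"failed_logins_hostA": "auth", "failed_logins_hostB": "auth",
              "dns_queries_hostA": "network"}
    a = encode_anomaly_pattern({"failed_logins_hostA": 1.0}, groups)
    b = encode_anomaly_pattern({"failed_logins_hostB": 1.0}, groups)
    c = encode_anomaly_pattern({"dns_queries_hostA": 1.0}, groups)
    print(len(a & b), len(a & c))
```

Because the indices come from a hash, unrelated metrics can occasionally collide on a bit, so overlap between dissimilar metrics is made unlikely rather than strictly prevented. On a day with very many anomalous metrics the result would also stop being sparse, so a real encoder would probably need to cap or subsample the active bits.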

artifex · April 10, 2019, 10:33am

Thanks A LOT!
I’ll get started right away… Thanks.
