I’m trying to use NuPIC to detect possible cyber attacks. The basic idea is to collect a large number of metrics, each measuring some “feature” of how users, computers and networks behave within the organization. Each metric is sent both to an HTM model and to other algorithms (mostly statistical) that detect anomalies in the behaviour of a single metric. Each of these components reports the anomalies it detects to another service, which looks at a fixed time window (let’s say a day) and sends an HTM model an array indicating which metrics behaved anomalously during that window, and with what magnitude, and which metrics showed no sign of anomalous behaviour. The goal is for this service to detect whether the day’s pattern of metric anomalies is itself anomalous and unpredictable, or whether that pattern is actually predictable. The catch is that the number of such metrics can be quite large: a few thousand for a small organization and potentially tens or hundreds of thousands for a medium-sized one.
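To make the daily array concrete, here is a rough sketch of how I picture the aggregation step. The function name `build_daily_vector`, the metric names and the choice to keep the peak score per metric are all just my assumptions for illustration; none of this is NuPIC API.

```python
from collections import defaultdict


def build_daily_vector(metric_names, anomaly_events):
    """Return one magnitude per metric for a single window (0.0 = no anomaly).

    anomaly_events is an iterable of (metric_name, score) pairs reported by
    the per-metric detectors during the window. Keeping the maximum score per
    metric is an assumption; a mean or a count would work the same way.
    """
    peak = defaultdict(float)
    for name, score in anomaly_events:
        peak[name] = max(peak[name], score)
    # Metrics that never reported an anomaly get 0.0.
    return [peak.get(name, 0.0) for name in metric_names]


# Hypothetical metrics and events for one day.
metrics = ["failed_logins", "dns_queries", "bytes_out"]
events = [("dns_queries", 0.4), ("dns_queries", 0.9), ("bytes_out", 0.2)]
daily_vector = build_daily_vector(metrics, events)
```

The position of each value is fixed by `metric_names`, so the same metric always lands in the same slot from one day to the next.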
I have already built the first part, which handles anomaly detection on single metrics (similar to the gym example), and it works perfectly. For the second HTM part, I think I’ll have to define a map indicating which metrics are, or can be considered, “similar” to one another. I would then write an encoder that maintains overlap between similar metrics, prevents overlap between metrics that are not “similar”, and represents the whole thing as an SDR.
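Here is a minimal sketch of the kind of encoder I have in mind, in plain Python rather than as a real NuPIC encoder class. I assume the similarity map is given as groups of metric names; every metric in a group shares a block of bits (so similar metrics overlap), and each metric also gets a few bits of its own, while different groups use disjoint blocks. The names `build_layout`, `encode_day` and `overlap`, the bit counts and the threshold are all hypothetical, and the sketch ignores magnitude beyond a simple threshold.

```python
def build_layout(groups, shared_bits=6, unique_bits=4):
    """Assign bit indices to every metric.

    groups maps a group name to the list of metrics considered similar.
    Returns (layout, total_bits), where layout maps metric -> frozenset of bits.
    """
    layout = {}
    next_bit = 0
    for group in sorted(groups):
        # Bits shared by every metric in this group.
        shared = frozenset(range(next_bit, next_bit + shared_bits))
        next_bit += shared_bits
        for metric in groups[group]:
            # Bits that belong to this metric only.
            own = frozenset(range(next_bit, next_bit + unique_bits))
            next_bit += unique_bits
            layout[metric] = shared | own
    return layout, next_bit


def encode_day(layout, total_bits, magnitudes, threshold=0.5):
    """Turn on the bits of every metric whose magnitude reaches the threshold."""
    sdr = [0] * total_bits
    for metric, magnitude in magnitudes.items():
        if magnitude >= threshold:
            for bit in layout[metric]:
                sdr[bit] = 1
    return sdr


def overlap(layout, a, b):
    """Number of bits two metrics have in common."""
    return len(layout[a] & layout[b])


# Hypothetical similarity map and one day of anomaly magnitudes.
groups = {
    "auth": ["failed_logins", "password_resets"],
    "network": ["dns_queries"],
}
layout, total_bits = build_layout(groups)
sdr = encode_day(layout, total_bits, {"failed_logins": 0.9, "dns_queries": 0.1})
```

With this layout, `failed_logins` and `password_resets` share the six group bits, while `failed_logins` and `dns_queries` share none. Two things I am unsure about: the output width grows linearly with the number of metrics, and the result stops being sparse on a day when many metrics are anomalous at once.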
I would very much like your feedback on this approach. If it is sound, an example of a similar encoder in Python would be very helpful and much appreciated.
Thanks a lot