Anomaly detection on Time Series Data with multiple categories

srivatsa96 · March 12, 2018, 2:49pm

I am building a system whose data has three columns: a timestamp, a value, and a category. Will the HTM anomaly detection system work if I simply create a MultiEncoder with all three columns, or will I have to separate the time series by category before feeding it in? There could potentially be on the order of ~10^4 categories.

rhyolight · March 12, 2018, 2:53pm

Anomaly detection should work either way. See details of anomaly score calculation in API docs for more.

Having so many categories might be a problem. At a minimum you’ll need 10⁴ bits in the input space, and I don’t like having only one bit represent a discrete data point. I would rather spread it out so several bits are used. Is there any way to generalize over the categories to narrow them down?
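
To illustrate what "several bits per category" could look like, here is a minimal sketch. It is not NuPIC's own category encoder; it simply hashes each category name to a fixed number of active bits. The names `encode_category` and `overlap`, the sizes `n=2048` and `w=21`, and the example application names are all assumptions made up for illustration.

```python
import hashlib


def encode_category(name, n=2048, w=21):
    """Map a category name to w active bit indices in an n-bit space.

    Hypothetical helper: the bits are picked deterministically by hashing
    the name together with a counter until w distinct indices are found.
    """
    active = set()
    counter = 0
    while len(active) < w:
        key = "{}:{}".format(name, counter).encode("utf-8")
        digest = hashlib.sha256(key).digest()
        # Use the first 8 bytes of the digest as an integer bit position.
        active.add(int.from_bytes(digest[:8], "big") % n)
        counter += 1
    return sorted(active)


def overlap(a, b):
    """Count the active bits shared by two encodings."""
    return len(set(a) & set(b))


app_a = encode_category("billing-service")  # made-up application name
app_b = encode_category("auth-service")     # made-up application name
print(len(app_a), overlap(app_a, app_a), overlap(app_a, app_b))
```

Encodings built this way are stable across runs and do not need one dedicated bit per category, but unrelated names share bits only by chance, so they carry no notion of similarity between categories.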

srivatsa96 · March 12, 2018, 3:16pm

No, each of them is independent of the others, and over time (years) their number will keep increasing.

rhyolight · March 12, 2018, 3:51pm

Are some of these categories more important to monitor than others?

srivatsa96 · March 12, 2018, 4:09pm

Those categories are actually different applications on our IT infrastructure. We offer the user a drop-down to select the application, so all are equally important; moreover, an anomaly in any one of the systems will trigger an alert.

Also, at a single point in time, multiple applications may be active and each may report a value. Rather than the timestamp alone being unique, it is the pair <timestamp,category> that is unique. Do you think this use case can be handled using HTM?

rhyolight · March 12, 2018, 4:14pm

I wonder if pure volumes would be a pattern. Are there common values that can be accumulated across categories? Perhaps they have patterns.
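
As a rough sketch of what accumulating across categories could mean, the snippet below sums the values of all categories that share a timestamp, giving a single "total volume" series. The record layout `(timestamp, category, value)` follows the three columns described above; the function name `total_by_timestamp` and the sample rows are assumptions for illustration only.

```python
from collections import defaultdict


def total_by_timestamp(records):
    """Sum values over all categories for each timestamp.

    records: iterable of (timestamp, category, value) tuples.
    Returns a list of (timestamp, total) pairs sorted by timestamp.
    """
    totals = defaultdict(float)
    for timestamp, _category, value in records:
        totals[timestamp] += value
    return sorted(totals.items())


# Made-up sample rows; <timestamp, category> is unique, the timestamp is not.
records = [
    ("2018-03-12 14:00", "app-a", 3.0),
    ("2018-03-12 14:00", "app-b", 2.0),
    ("2018-03-12 14:05", "app-a", 4.0),
]
volume = total_by_timestamp(records)
```

The resulting series has one value per timestamp, so it could be fed to a single model, at the cost of no longer knowing which application caused an anomaly.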

rhyolight · March 12, 2018, 4:16pm

Could you provide more information about each “category”? What I mean is that there may be a better way to encode that data than treating each one as unique. There must be some similarities between categories, right?

srivatsa96 · March 12, 2018, 4:42pm

The categories are various applications running on some server. We are designing a system to monitor various parameters related to those applications. An application may be added over time or may be decommissioned. I am using the flink-htm implementation because the data is continuously streamed.
As far as I understand, though, all applications perform unique operations and there isn’t any pattern among them. However, they run on a limited number of hosts, and one application may be running on more than one host.

rhyolight · March 12, 2018, 4:46pm

It sounds like if you were to use HTM for this problem, you would need one HTM model for each category, which is likely not possible. I’m not sure this is a good fit until we have better HTM hardware options.
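
To make the "one model per category" layout concrete, here is a small routing sketch. It does not use HTM at all: `RunningStatsDetector` is a hypothetical stand-in (a running z-score) that only marks where a real per-category model would sit, and `PerCategoryRouter` is likewise an invented name.

```python
import math


class RunningStatsDetector(object):
    """Stand-in for a per-category model (NOT HTM): running z-score."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the mean

    def score(self, value):
        # Score against the statistics seen so far...
        z = 0.0
        if self.count >= 2:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0:
                z = abs(value - self.mean) / std
            elif value != self.mean:
                z = float("inf")
        # ...then update them with Welford's online algorithm.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return z


class PerCategoryRouter(object):
    """Keep one detector per category, created on first sight."""

    def __init__(self, factory=RunningStatsDetector):
        self.factory = factory
        self.models = {}

    def process(self, timestamp, category, value):
        # timestamp is unused by the stand-in; a real model would encode it.
        if category not in self.models:
            self.models[category] = self.factory()
        return self.models[category].score(value)

    def retire(self, category):
        # Drop the model of a decommissioned application.
        self.models.pop(category, None)


router = PerCategoryRouter()
for i, v in enumerate([10.0, 11.0, 10.0, 11.0]):
    router.process(i, "app-a", v)
    router.process(i, "app-b", v * 100)
spike = router.process(4, "app-a", 50.0)
```

The `models` dictionary grows by one entry per category, which is exactly the cost in question: with ~10^4 applications that means ~10^4 live models, each far heavier than this stand-in if it were a real HTM model.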

srivatsa96 · March 12, 2018, 4:53pm

Yes, this was the limitation I was facing when I thought of encoding the category as well. Anyway, I will see if there is some similarity among the categories and will get back in a day or so.
