I need to pass min and max values in the CSV when generating model parameters, and then write the min and max of the feature I want to track again in modelParams. But what if future data from the data stream has values higher or lower than that range? Will anomaly detection continue to handle them without breaking?
Solved part (1). There were a lot of duplicates in my training data. Once I fixed that, everything looked fine. I used the Pandas function pandas.DataFrame.drop_duplicates to address this.
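For anyone hitting the same issue, here is a minimal sketch of that cleanup step. The column names and values are made up for illustration and are not from the original dataset.

```python
import pandas as pd

# Hypothetical training data containing one fully duplicated row.
df = pd.DataFrame({
    "timestamp": ["2017-01-01 00:00", "2017-01-01 00:05",
                  "2017-01-01 00:05", "2017-01-01 00:10"],
    "value": [10.0, 12.5, 12.5, 11.0],
})

# Drop rows that are exact duplicates of an earlier row (all columns equal),
# keeping the first occurrence, then rebuild a clean 0..n-1 index.
deduped = df.drop_duplicates().reset_index(drop=True)

print(len(df), len(deduped))
```

If only the timestamp should be unique, passing subset=["timestamp"] to drop_duplicates restricts the comparison to that column.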
Does the new scalar data stream have a new min/max? You could use the min/max to identify a resolution for the RandomDistributedScalarEncoder, which is also explained in HTM School. Here is an example of how we use min/max to get a resolution:
The only required parameter is resolution, which determines the resolution of input values.
numBuckets is used to compute the resolution in the example above because it is a little easier to reason about. @scott might have more to say here.
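As a rough sketch of the idea (not the actual NuPIC code), resolution can be derived by dividing a padded min/max range by a bucket count. The function name estimate_resolution and its defaults (130 buckets, 20% padding, a floor of 0.001) are assumptions chosen for illustration.

```python
def estimate_resolution(min_val, max_val, num_buckets=130,
                        padding_fraction=0.2, min_resolution=0.001):
    """Estimate an encoder resolution from an observed value range.

    All defaults are illustrative assumptions, not confirmed library values.
    """
    if max_val < min_val:
        raise ValueError("max_val must be >= min_val")
    # Pad the observed range on both sides so that values slightly
    # outside it still fall within the planned buckets.
    padding = (max_val - min_val) * padding_fraction
    padded_range = (max_val + padding) - (min_val - padding)
    # Spread the padded range over the buckets, but never go below the floor
    # (this also covers the degenerate case min_val == max_val).
    return max(min_resolution, padded_range / num_buckets)


resolution = estimate_resolution(0.0, 100.0)
print(resolution)
```

The resulting value would then be supplied as the resolution parameter of the RandomDistributedScalarEncoder. Fewer buckets give a coarser resolution, so nearby values look more alike to the model; more buckets give a finer one.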
Now I don’t need to worry about n and w. The only parameter is resolution, which greatly affects the number of anomalies detected on my dataset. Could you suggest reliable ways to determine “resolution”?