I am doing anamoly detection on server log data in real time stream…One of the feature does not have much large variation lyk mostly the range is from 4.00 to 4.05 …What is the best encoding for such a feature in which the range is small …
What should be the value of n and w such that when the value has small change then no anamoly is detected but when large change happens like 4 to 2 or 4 to 5 then it should detect anamoly
You could try a ScalarEncoder with ‘minval’=4.0 and ‘maxval’=4.05 with ‘clipInput’=True. This will set the input range in that tight window and treat all values below 4.0 as 4.0 and all above 4.05 as 4.05. There’s not an exact science to the w/n ration, you could try ‘w’=21 and n=231. I think the encoder requires that n >= 6*w.
I’d also note that the system won’t necessarily raise an anomaly due to a very high or low value outside the range, like 2 or 5 as you mention. It will likely cause the Anomaly Score at that time step to be 1.0, though the Anomaly Likelihood value will also depend on the distribution of Anomaly Score values in the recent past.
But i read that for detecting anomaly we should always use anomalyLikelihood param.Please tell me how does anomalyScore help us …in order to detect something as anomaly i have set the anomalyLikelihood threshold to be greater than 0.9999 …How does Anomaly Score play a role in detecting anomaly