Datetime in anomaly detection


#1

Hello,

I am new to this forum and have a deep interest in HTM theory.

Recently, I have been looking at anomaly detection, but I don’t understand how the datetime is taken into account.

I thought about several possibilities:

  • The datetime is represented as a context for a temporal memory (could be a kind of apical stimulation).

    However, I doubt that, because I haven’t seen any reference to apical inputs in NuPIC.

  • The datetime is concatenated with the data values and sent as a proximal input to the model.

    But that would mean the backtracking is also applied to the datetime, which might affect the confidence calculation and thus make the backtracking go wrong.

  • Or maybe, more simply, a reset is made when the data of an entire week have been seen.

Does anyone know anything about that? It would help a lot!

Thanks!


#2

This.

I don’t understand this statement. Can you explain?


#3

Hey @chacd,

So I think you’re closest with your second option, where the datetime is concatenated with the data values and sent as proximal input:

To summarize the details, the raw datetime value is encoded alongside the raw metric value, and these encodings are concatenated into a total encoding, which goes into the SP. This yields the SDR of 2048 columns that goes into the TM.
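To make the concatenation step concrete, here is a minimal sketch in plain Python. It is not NuPIC's encoder code: the function names, bit widths and value range are all assumptions chosen for illustration, and the SP and TM stages are left out entirely.

```python
def encode_scalar(value, min_val, max_val, n_bits, n_active):
    """Toy scalar encoder: a contiguous run of n_active ones whose position tracks value."""
    value = min(max(value, min_val), max_val)  # clip into the supported range
    n_buckets = n_bits - n_active + 1
    start = int(round((value - min_val) / (max_val - min_val) * (n_buckets - 1)))
    return [1 if start <= i < start + n_active else 0 for i in range(n_bits)]


def encode_time_of_day(hour, n_bits, n_active):
    """Toy periodic encoder: the run of ones wraps around, so late night overlaps early morning."""
    start = int(hour / 24.0 * n_bits) % n_bits
    active = {(start + k) % n_bits for k in range(n_active)}
    return [1 if i in active else 0 for i in range(n_bits)]


def encode_record(value, hour):
    """Concatenate the metric encoding and the datetime encoding into one input vector."""
    # All widths below are hypothetical; real models pick them per data set.
    metric_bits = encode_scalar(value, 0.0, 100.0, n_bits=100, n_active=21)
    time_bits = encode_time_of_day(hour, n_bits=48, n_active=11)
    return metric_bits + time_bits  # this combined vector is what the SP would see


encoding = encode_record(value=42.0, hour=13.5)
```

The point of the sketch is only that the SP receives a single bit vector, so it has no notion of which bits came from the datetime and which from the metric.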

I’m not versed enough in backtracking to address your concern that it would also be applied to the datetime.

It would seem odd for BacktrackingTM’s use of datetime to be invalid if the normal TM’s use of it isn’t, though I’m not sure.


#4

Thank you for your answer!

What I meant is that, if a pattern in the sequence is repeated within a day, the backtracking might struggle to recognize it because the datetime won’t be the same.

For example, if we have:

Datetime: 0 1 2 3 4 5 6 7

Sequence: a b c d a b c d

If we do backtracking and thus burst the fifth input (the second a), it won’t be able to predict b, because the first a was encoded with datetime 0 whereas this one is encoded with datetime 4, so a subpopulation of the active columns will differ. The confidence value for the next step (b, 5) might then fall below the threshold, and the backtracking won’t find a valid sequence.

Thus, within a day, I think we can hardly recognize patterns in the target variable (the sequence) that we aim to predict.
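To make the example above concrete, here is a toy sketch of the overlap between the two occurrences of a. It works at the encoder level only, with made-up bit widths and a non-periodic datetime encoding; what the SP columns actually do with such inputs is not modelled here.

```python
SYMBOLS = "abcd"


def encode(symbol, t, symbol_bits=20, time_bits=10):
    """Toy encoding: the set of active bit indices for a (symbol, datetime) pair."""
    s = SYMBOLS.index(symbol)
    # Each symbol gets its own block of bits.
    symbol_part = {s * symbol_bits + k for k in range(symbol_bits)}
    # Each datetime step gets its own block, placed after all symbol blocks.
    offset = len(SYMBOLS) * symbol_bits
    time_part = {offset + t * time_bits + k for k in range(time_bits)}
    return symbol_part | time_part


first_a = encode("a", 0)   # a seen at datetime 0
second_a = encode("a", 4)  # a seen again at datetime 4

overlap = len(first_a & second_a)   # bits shared by both occurrences
fraction = overlap / len(first_a)   # share of the input that is identical
```

With these assumed widths the two inputs share only their symbol bits, which is the mismatch I am worried about.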

Am I wrong somewhere? What do you think?


#5

Thank you!


#6

(Note: I’m very new here, so my answer may be completely wrong. I’m just speculating).

I see two answers to this:

  1. If the encoding of a, b, c, … is sufficiently large (sparse encoding) compared to the encoding of the datestamp, then the datestamp should not influence the pattern detection by much, so the sequence should still be detected. Perhaps at a cost, but still.

  2. If, on the other hand, there is some repetitive property in the datestamp itself, it will reinforce the detection of the pattern (e.g. if a’s occur more often on weekends, that’s something you want your pattern detector to detect).
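Both points can be sketched with a small calculation. This is speculative toy code, not anything from NuPIC: the helper names, the widths and the idea of a datestamp encoding that depends only on the time modulo some period are all assumptions for illustration.

```python
def time_bits(t, period, width=10):
    """Toy periodic datestamp encoding: active bits depend only on t modulo period."""
    phase = t % period
    return {phase * width + k for k in range(width)}


def shared_fraction(value_width, t1, t2, period, width=10):
    """Fraction of active bits shared by two inputs with the same value seen at t1 and t2."""
    shared_time = len(time_bits(t1, period, width) & time_bits(t2, period, width))
    return (value_width + shared_time) / (value_width + width)


# Point 1: a wider value encoding dilutes the effect of a different datestamp.
small_value = shared_fraction(20, 0, 4, period=8)
large_value = shared_fraction(200, 0, 4, period=8)

# Point 2: if the datestamp repeats with the pattern, the inputs become identical.
matching_period = shared_fraction(20, 0, 4, period=4)
```

Here `large_value` is closer to 1 than `small_value`, and `matching_period` is exactly 1 because datetimes 0 and 4 fall on the same phase.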


#7

I don’t think “backtracking” is doing what you think it is doing. From what I understand (and I honestly don’t understand this very well, because I’m not interested in non-biological algorithms), backtracking means holding on to definitions of recently seen sequences outside the structure of the cells, ignoring biological plausibility, in order to better predict which sequences are currently being expressed in the data stream.