The hotgym example shows how to encode date + scalar values. I have a time series with ordinal numbers instead of dates. How can I encode that? Should I use scalar encoders twice? Can anyone help?
The DateEncoder used in the hotgym example is actually just a set of non-overlapping scalar encoders which encode parts of the date-time. Their SDRs are concatenated together into a single SDR. It would be safe to replace the DateEncoder with a single scalar encoder if you have an ordinal value representing an evenly spaced duration of time. You could use either the ScalarEncoder object or the RDSE object; you would only need one.
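To illustrate the point, here is a toy sketch in plain Python (these are not the htm.core classes, just stand-ins) showing that a "date encoder" is nothing more than several scalar encoders whose bit arrays are concatenated, and that an ordinal time column needs only one of them:

```python
# Toy scalar encoder: maps a value in [min_val, max_val] onto a fixed-width
# bit array with a contiguous run of active bits.  This is a simplified
# illustration, not htm.core's ScalarEncoder.
def scalar_encode(value, min_val, max_val, size=40, active=5):
    """Encode a scalar as `size` bits with `active` consecutive 1s."""
    span = size - active
    frac = (value - min_val) / (max_val - min_val)
    start = int(round(frac * span))
    bits = [0] * size
    for i in range(start, start + active):
        bits[i] = 1
    return bits

# A "date encoder" is just several such encoders concatenated:
hour_bits    = scalar_encode(14, 0, 23)   # hour of day
weekday_bits = scalar_encode(5, 0, 6)     # day of week
date_bits = hour_bits + weekday_bits      # concatenated SDR

# With an evenly spaced ordinal time column you keep only ONE encoder:
ordinal_bits = scalar_encode(1234, 0, 10000)
```

The real encoders add details (hashing in the RDSE, wrap-around for periodic fields), but the concatenation idea is the same.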
Doing this will not give exactly the same results because it will be more difficult to identify a “weekend” or the end of a day. But…try it out and see what you get.
Do these ordinal numbers repeat, or just increase monotonically? If they don't repeat, I'd just drop the column and use the raw values only.
Thank you for the quick response.
Still using hotgym as an example: with two columns (date & values), we have this concatenation:
encoding = SDR( encodingWidth ).concatenate([consumptionBits, dateBits])
But if the first column holds values increasing by +1, and I drop it, will this code line simply become:
encoding = consumptionBits
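Roughly, yes, but note that the SpatialPooler's input width has to shrink to match. A hypothetical sketch of the bookkeeping change, using plain Python lists in place of the htm.core SDR objects (the encoder size and value range here are made-up stand-ins for the hotgym parameters):

```python
# Toy scalar encoder standing in for the hotgym ScalarEncoder.
def encode_consumption(value, size=60, active=6, min_val=0.0, max_val=100.0):
    """Encode a consumption value as `size` bits with `active` 1s."""
    span = size - active
    frac = min(max((value - min_val) / (max_val - min_val), 0.0), 1.0)
    start = int(round(frac * span))
    return [1 if start <= i < start + active else 0 for i in range(size)]

# Before: encodingWidth was the sum of both encoders' sizes, and the two
# SDRs were concatenated.  After dropping the date column:
consumptionBits = encode_consumption(42.0)
encoding = consumptionBits        # no concatenation needed
encodingWidth = len(encoding)     # SpatialPooler input width must match this
```

So besides `encoding = consumptionBits`, make sure `encodingWidth` (and anything derived from it, like the SpatialPooler's input dimensions) is recomputed from the single remaining encoder.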
Any field (‘feature’ in common ML speak) that doesn’t have periodicity over time is basically just adding noise to the system.
In the hotgym data set the date column does have periodicity though, through days of the week and hours of the day. I think this helps the system pick up on the pattern of demand shift from weekday to weekend.
The system will naturally learn shorter-term patterns faster because they become apparent faster (in hotgym it's the daily pattern that plays out over hours). The weekday-to-weekend shift pattern, though, plays out over days and repeats weekly, so the periodicity takes much longer to show itself.
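If the samples are evenly spaced, the periodic part of a timestamp can be recovered from the ordinal index itself with modular arithmetic. A hypothetical helper along those lines (the sampling rate is an assumption, not something taken from hotgym):

```python
def periodic_features(ordinal, samples_per_hour=1):
    """Derive hour-of-day and day-of-week style features from an evenly
    spaced ordinal index.  `samples_per_hour` is an assumed sampling rate."""
    samples_per_day = 24 * samples_per_hour
    hour_of_day = (ordinal // samples_per_hour) % 24
    day_of_week = (ordinal // samples_per_day) % 7
    return hour_of_day, day_of_week

print(periodic_features(0))    # (0, 0)
print(periodic_features(25))   # (1, 1) -- one day and one hour in
```

Features like these could then be fed through periodic scalar encoders, restoring the weekday/weekend signal that plain ordinal values lose.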
Before using this algorithm with my data I tried to do some experiments with hotgym example.
- Hotgym without any changes shows the anomaly score on the chart as repeatable peaks (ca. 25 of them), with very low values in between, mostly below 0.05.
- After replacing the time with ordinal numbers (two columns: ordinal numbers and consumption values), I got a very low anomaly score everywhere (after the first ca. 400 values, presumably the learning period): it looks like a very low "comb" in the range 0 to 0.1, placed under the red input values.
- After removing the first column (leaving only the consumption values), I got irregular but very high anomaly scores everywhere, in the range of 0.6 to 1.
How can such big differences between the 2nd and 3rd cases be explained?
BTW - is there a possibility to show that here?
It must be because the encodings are very different in the two cases. In case 2 there's a second column of just ordinal values, and in case 3 there's only one column of raw values. In case 1 ("Hotgym without any changes"), the second column is a timestamp, which is perfectly periodic, adding to the predictability of the total encoding and causing the drop in anomaly scores relative to case 3 (where the timestamp column is not included).