Bad results when adding date encoder

helena_Thielen · November 6, 2019, 5:31pm

Hey everybody,

I started with my analysis.
I created an artificial dataset which looks like this:

as you can see:
I have a regular pattern. I have the exact same pattern for all weekdays (Monday to Friday).
Then there is a pattern for Saturday and one for Sunday.
So basically: I have the same pattern every week over one year.

I want to do a prediction on the data.

So my first step was to run a swarm.
I ran a large swarm including all the datapoints.
Here you see the resulting model_params:

MODEL_PARAMS = \
{ 'aggregationInfo': { 'days': 0,
                   'fields': [],
                   'hours': 0,
                   'microseconds': 0,
                   'milliseconds': 0,
                   'minutes': 0,
                   'months': 0,
                   'seconds': 0,
                   'weeks': 0,
                   'years': 0},
'model': 'HTMPrediction',
'modelParams': { 'anomalyParams': { u'anomalyCacheRecords': None,
                                  u'autoDetectThreshold': None,
                                  u'autoDetectWaitRecords': None},
               'clParams': { 'alpha': 0.08751250000000001,
                             'regionName': 'SDRClassifierRegion',
                             'steps': '1',
                             'verbosity': 0},
               'inferenceType': 'TemporalMultiStep',
               'sensorParams': { 'encoders': { '_classifierInput': { 'classifierOnly': True,
                                                                     'clipInput': True,
                                                                     'fieldname': 'consumption',
                                                                     'maxval': 324.167541908,
                                                                     'minval': 37.0,
                                                                     'n': 90,
                                                                     'name': '_classifierInput',
                                                                     'type': 'ScalarEncoder',
                                                                     'w': 21},
                                               u'consumption': { 'clipInput': True,
                                                                 'fieldname': 'consumption',
                                                                 'maxval': 324.167541908,
                                                                 'minval': 37.0,
                                                                 'n': 334,
                                                                 'name': 'consumption',
                                                                 'type': 'ScalarEncoder',
                                                                 'w': 21},
                                               u'datetime_dayOfWeek': None,
                                               u'datetime_timeOfDay': None,
                                               u'datetime_weekend': None},
                                 'sensorAutoReset': None,
                                 'verbosity': 0},
               'spEnable': True,
               'spParams': { 'boostStrength': 0.0,
                             'columnCount': 2048,
                             'globalInhibition': 1,
                             'inputWidth': 0,
                             'numActiveColumnsPerInhArea': 40,
                             'potentialPct': 0.8,
                             'seed': 1956,
                             'spVerbosity': 0,
                             'spatialImp': 'cpp',
                             'synPermActiveInc': 0.05,
                             'synPermConnected': 0.1,
                             'synPermInactiveDec': 0.037687500000000006},
               'tmEnable': True,
               'tmParams': { 'activationThreshold': 14,
                             'cellsPerColumn': 32,
                             'columnCount': 2048,
                             'globalDecay': 0.0,
                             'initialPerm': 0.21,
                             'inputWidth': 2048,
                             'maxAge': 0,
                             'maxSegmentsPerCell': 128,
                             'maxSynapsesPerSegment': 32,
                             'minThreshold': 11,
                             'newSynapseCount': 20,
                             'outputType': 'normal',
                             'pamLength': 2,
                             'permanenceDec': 0.1,
                             'permanenceInc': 0.1,
                             'seed': 1960,
                             'temporalImp': 'cpp',
                             'verbosity': 0},
               'trainSPNetOnlyIfRequested': False},
'predictAheadTime': None,
'version': 1}

So my question is the following:
I KNOW that there is a dependency on the day of week as well as the time of day.
But the swarm didn’t find that.
So… I decided to tune my parameters and add the encoding for these.
I added this in the right place (the rest of the parameters stayed the same):

u'datetime_timeOfDay': {'fieldname': 'datetime',
                        'name': 'datetime',
                        'timeOfDay': (21, 1),
                        'type': 'DateEncoder'},

Afterwards I checked some metrics:
the mean absolute error (MAE) and the root mean squared error (RMSE).
My results got worse when I added a time-of-day encoder.

Without the timeOfDay encoder:

MAE = 2.3064795497420802
RMSE = 5.589831939559891

With the timeOfDay encoder:

MAE = 4.370068288974104
RMSE = 14.843369165358364
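
For reference, the two metrics quoted above can be computed as in the minimal sketch below. The function names `mae` and `rmse` and the plain-list inputs are assumptions for illustration; the original post does not show how the numbers were produced.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / float(len(actual))


def rmse(actual, predicted):
    """Root mean squared error; squaring weights large misses more heavily."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / float(len(actual)))


# Illustrative values only, not the data from this thread.
actual = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
print(mae(actual, predicted), rmse(actual, predicted))
```

Because RMSE squares each error, a few large misses raise it faster than MAE. In the numbers above RMSE grew proportionally more than MAE, which would be consistent with a small number of large prediction errors rather than a uniform degradation.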

Then I also checked the results for a bunch of different parameter settings and for different encoders (like dayOfWeek etc.).
The result was ALWAYS worse.

I also added some noise to my data and checked again if I can improve the prediction by tuning the parameters.
But SAME result – the more importance I gave to the date and time, the worse the prediction.

How is that possible? I know that there is a dependency on this information, but the model returns bad results when I add it.
Is there anything I do not see? Anything else I HAVE TO tune as well?

I really need some help… I have been trying to tune my parameters for days… and for me all this does not make any sense.

I hope someone can help me! Thank you all very much in advance.
Best,
Helena

rhyolight · November 6, 2019, 5:48pm

I think the problem is that your data is too structured. It’s not real-world data. It is not noisy. The pattern repeats perfectly every cycle. There is no need to encode datetime in this case because the prediction will be perfect just looking at the perfect period of data. The swarm gave you back the right model with no date encoder because the data was too perfect. The TM is perfectly memorizing this one sequence over and over with no variation.

My suggestion is to find real data or add some randomness to your data, so that the swarm finds model params that more generally fit the pattern of the noisy stream. It might find that date encoding helps more depending on how random your data gets.

helena_Thielen · November 6, 2019, 6:00pm

hey =)

first of all: thanks for the super fast reply.
Well there are two things:

I also added 15% noise and the result was just the same… You think maybe 15% is still not enough randomness?
I also thought that the result should be perfect after some time.
But in fact, even after a year of training with 15min values I had some bad predictions:

the lower blue graph shows the absolute error.
So the model is even after a year not able to predict all the values perfectly.

I checked when these mistakes occurred: they always come when there is the same scalar value on a Monday or on Saturday/Sunday.
So these values seem to confuse the SP/TM.
In my opinion an additional dayOfWeek encoder would be helpful…

helena_Thielen · November 6, 2019, 6:13pm

And another thing:
Even when I have a repeating pattern:
Why does my prediction get worse? I mean: why is the result not just the same?

Is it because the important mini-columns get less chances to learn? So they need longer to learn the sequential pattern?

What I mean is: the bigger the input space and the more the input changes, the less often any one mini-column will be active. Is this right?
And if so, every mini-column is active less often, so it has fewer chances to learn.
So it will take more time for the TM to memorize the sequence…

Does this make sense?

rhyolight · November 6, 2019, 6:21pm

How did you add the noise? The chart you showed does not look like it has enough noise.

In this case was the data perfect or did it have some random variations?

I think you are running into the repeating inputs problem for the TM. HTM does not predict that well, that’s why we used it primarily in anomaly detection applications. Making the data more random should help, I would think. You could also manually reset the TM at a preferred temporal cycle, which will hard-code the period into the system.
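
A manual reset at a fixed period could be wired up roughly as in the sketch below. It assumes an OPF-style model object exposing `run(record)` and `resetSequenceStates()`; the helper `run_with_periodic_reset`, the `StubModel` class, and the weekly period of 672 records (7 days of 15-minute values) are hypothetical and only there so the sketch runs without NuPIC installed.

```python
class StubModel(object):
    """Hypothetical stand-in for an OPF model; it only counts resets."""

    def __init__(self):
        self.resets = 0

    def run(self, record):
        # A real model would return an inference result here.
        return record

    def resetSequenceStates(self):
        self.resets += 1


def run_with_periodic_reset(model, records, period=672):
    """Feed records to the model, resetting sequence state every `period` records."""
    results = []
    for i, record in enumerate(records):
        # Reset at the start of each new cycle, but not before the first record.
        if i > 0 and i % period == 0:
            model.resetSequenceStates()
        results.append(model.run(record))
    return results


model = StubModel()
run_with_periodic_reset(model, range(2000))
print(model.resets)
```

Choosing the period is the design decision here: a weekly reset hard-codes the weekly cycle, while a daily reset would hard-code the daily one and lose the weekday/weekend context.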

sheiser1 · November 6, 2019, 9:52pm

I’d suggest making charts just like this for:

The raw anomaly scores
The counts of predictive columns

I find these are useful to expose how quickly and how precisely the model is learning.

Those spike points in your current chart do make intuitive sense though, since they represent transitions that should take longer to learn.

One spike is when the whole sequence cycles back to the beginning, a transition which should take longest to learn since it repeats the least often. Also the other spike comes during a small stretch of volatile activity which also only appears once per total cycle.

These are parts of the pattern which should eventually be learned, but since they are less common they’ll take more repetitions.

This learning curve will be expressed more by the plots of anomaly scores and prediction counts as well.

helena_Thielen · November 7, 2019, 12:07pm

Good morning,

first of all: Thanks a lot for your answers.
So let me just clarify this:

I have two artificial data sets. The first one is the one I shared here:

The second one is the same, just with 15% noise. It looks like this:

I am sorry about the confusion!

##################################################################################
first: let’s talk about the data set with NO noise (figure 1 in this post)

Yessss!! I know what you mean. So in my opinion, additional information about the day of week should help the model to learn these transitions, which only occur on specific days (like the spike on Fridays!).
But actually this information does not help (and I tried a bunch of different values for w and radius).

Before tuning by hand that much I ran different swarms. Some of them told me to use a timeOfDay encoder, some told me to not use it…

Question 1: How can I know when to apply a datetime encoder and when not? My intuition says: I should use it…
I am super confused with that sorry…

##################################################################################
Now: let’s talk about the other data set (figure 2 in this post, the one with 15% noise)

Because of the noise included I don’t think that I have the

btw: very interesting discussion

What I have done with this 2nd data set: I also ran 2 swarms for this dataset. One told me to use this:

 u'datetime_timeOfDay': { 'fieldname': 'datetime',
                           'name': 'datetime',
                           'timeOfDay': ( 21, 4.84385264708503),
                           'type': 'DateEncoder'},

(I don’t think that a radius of almost 5 hours makes much sense… but this was the swarm result)

The other one to use this:

u'datetime_weekend': { 'fieldname': 'datetime',
                       'name': 'datetime',
                       'type': 'DateEncoder',
                       'weekend': ( 21, 1)}

(I don’t know what the (21, 1) means here, because the docs mention just a single value for the width, but this is copied from my model_params.)
[Question 2: What does the tuple mean here?]

I included both cases in my model_params: first separately, then in combination, then with tuned w and radius.
The result was always the same: my prediction was worse.

Sooo back to my question from above:

Question 3: Is this the reason for the worse results even when the patterns do not repeat perfectly?
Do my thoughts make sense?

Soo I hope that I could explain myself well this time
Thank you all very much for your help!! (I really love this forum)
Best,
Helena

rhyolight · November 7, 2019, 9:16pm

If your intuition says you should use it, I think you should use it.

Yeah, that looks strange. Change it to simply 21. This means you’ll have a 42 bit output. (I assume the 2nd element is ignored.)

That makes sense to me, the default is 4 hours. So 1PM and 3PM would have some overlap, but 1PM and 5PM would have none. This seems to be about the right fuzziness for a “time of day” to me.
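
The overlap behaviour described here can be illustrated with a toy periodic encoder. This is a simplified sketch, not NuPIC's `DateEncoder`: the functions `encode_time_of_day` and `overlap` are hypothetical, and the bit layout (a block of `w` contiguous bits sliding around a ring) is an assumption chosen so that two times exactly one radius apart share no bits.

```python
def encode_time_of_day(hour, w=21, radius=4.0, period=24.0):
    """Toy periodic encoder: returns the set of active bit indices."""
    # Total bits so that a shift of one radius moves the block by w bits.
    n = int(round(w * period / radius))  # 126 for w=21, radius=4
    start = int(hour * n / period) % n
    return set((start + i) % n for i in range(w))


def overlap(hour_a, hour_b, **kwargs):
    """Number of active bits shared by the encodings of two times of day."""
    a = encode_time_of_day(hour_a, **kwargs)
    b = encode_time_of_day(hour_b, **kwargs)
    return len(a & b)


print(overlap(13, 15))  # 1PM vs 3PM: some shared bits
print(overlap(13, 17))  # 1PM vs 5PM: one full radius apart, no shared bits
```

A larger radius makes more distant times look similar to the model, and a smaller one makes each time of day more distinct, so the value is worth checking against how quickly the pattern changes over a day.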

See also [8] in the NuPIC Walkthrough Jupyter Notebook. If you run this locally, you can create a DateEncoder with different params and easily see the type of encodings it creates.

The repeating patterns problem is one of the reasons NuPIC doesn’t make good predictions. It is more useful as an anomaly detection system.

Zbysekz · November 8, 2019, 6:41am

@helena_Thielen I think you created the noise just by adding 15% noise to the values, right?
If that is true, the timing of the data is unchanged, and the TM still does not need the date encoder.
For example, the time at which the high peak occurs does not change; its value just deviates by 15%.
If the TM can get the information just from the index of the data, it will.

I would suggest also adding some noise to the time axis. It will probably be enough to randomly pick a few points and delete them, then pick a few other points and insert before each of them a new point whose value is interpolated between that point and the previous one. Just a suggestion.
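
One way to read this suggestion is sketched below. The helper `jitter_time_axis` and its parameters are hypothetical, and it works on a plain list of values; how the timestamps are reassigned afterwards is left open, since the thread does not specify it.

```python
import random


def jitter_time_axis(values, n_delete=5, n_insert=5, seed=0):
    """Return a copy of `values` with a few points deleted and a few
    interpolated points inserted, so the pattern no longer repeats at
    exactly the same index in every cycle."""
    rng = random.Random(seed)
    out = list(values)  # leave the caller's data untouched

    # Delete a few randomly chosen points.
    for _ in range(n_delete):
        if len(out) > 2:
            del out[rng.randrange(len(out))]

    # Insert a midpoint between a randomly chosen point and its predecessor.
    for _ in range(n_insert):
        i = rng.randrange(1, len(out))
        out.insert(i, (out[i - 1] + out[i]) / 2.0)

    return out


series = [float(v) for v in range(100)]
noisy = jitter_time_axis(series, n_delete=5, n_insert=3)
print(len(series), len(noisy))
```

With unequal numbers of deletions and insertions the cycle length itself drifts, which is what prevents the TM from relying on position in the sequence alone.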

This is a common problem in control theory as well. With artificial data, control can be perfect even with added noise, but real data is another story.

rhyolight · November 8, 2019, 5:15pm

This is a very good point. HTM works better on real data than fake data. Do you have a real data set you intend to use HTM on?

helena_Thielen · November 14, 2019, 9:42am

I see… that makes total sense to me. Thank you very much!!!

Yes! I do… The plan was to see the performance of HTM compared with NN on artificial datasets first.
But I will move on to my real data and see what happens.

Thank you all sooooo much for your help

rhyolight · November 14, 2019, 4:39pm

I hope you can share the structure of your data. A big part of the problem is understanding the data, especially fields and aggregations that help predictions. I always try to plot the data and find patterns with my own human brain first. If I can see patterns, the HTM should be able to see them if tuned properly.

helena_Thielen · December 18, 2019, 1:43pm

Hey again,

I was just checking this topic again and realized that I didn’t understand this completely:

@sheiser1 : What do you mean by the second point? And maybe you have a hint on how to get this information.

I would really appreciate your help.
Best,
Helena

sheiser1 · December 18, 2019, 8:46pm

Hey @helena_Thielen,

So the TM can make different numbers of predictions at any given time. If it’s seeing new patterns or noise it will predict nothing, and if it’s in a highly familiar and specific sequence it will make few predictions (or just 1). Whereas if it’s in a vaguely familiar sequence it may make many predictions.

For a simple example, let’s say it’s trained on the sequence:
A,B,C,D,X,B,C,Y,A,B,C,D,X,B,C,Y…

If you feed this TM with just “B,C”, it will predict both “D” and “Y”, since it doesn’t have enough context to rule out either one. However, if you feed in “A,B,C” it will only predict “D”. In this case it can predict more precisely than in the first case.

So in the first case, the number of predicted columns would be ~80 (~40 each for “D” and “Y”). However, the number of predicted columns in the second case would be only ~40, since it’s only predicting “D”. This difference between 40 and 80 predicted columns is what I mean by precision.
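
The effect of context length in this example can be reproduced with a small lookup over the training sequence. This is only an analogy for what the TM does, not its algorithm; the function `next_symbols` is hypothetical.

```python
def next_symbols(sequence, context):
    """Collect every symbol that follows `context` somewhere in `sequence`."""
    k = len(context)
    target = tuple(context)
    found = set()
    for i in range(len(sequence) - k):
        if tuple(sequence[i:i + k]) == target:
            found.add(sequence[i + k])
    return found


# The repeating training sequence from the example above.
training = list("ABCDXBCY") * 3

print(sorted(next_symbols(training, "BC")))   # ambiguous: two candidates
print(sorted(next_symbols(training, "ABC")))  # longer context: one candidate
```

With roughly 40 columns per predicted symbol, two candidates correspond to about 80 predicted columns and one candidate to about 40.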

In my code I have this to extract that info from the TM:

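# Assumes TM_obj is a NuPIC temporal memory exposing getPredictedState()
# (one row of cell states per column) and numberOfCols.
TM_Cols_CellsPredStates = TM_obj.getPredictedState()
# Indices of columns that contain at least one predictive cell
TM_ColsIndices_withPredCells = [i for i in range(len(TM_Cols_CellsPredStates)) if
                                sum(TM_Cols_CellsPredStates[i]) > 0]
# Columns active per step at 2% sparsity (40 of 2048)
winnerCellsCount = int(TM_obj.numberOfCols * .02)
# Simultaneous predictions: ~1.0 for one predicted input, ~2.0 for two
predCount = len(TM_ColsIndices_withPredCells) / float(winnerCellsCount)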

helena_Thielen · December 19, 2019, 1:59pm

niiice =)
This is really nice information.
Thank you very much!

Will try that =)
