Asymmetric confidence when predicting a symmetric signal

PhilGoddard · March 18, 2019, 7:18pm

As part of trying to understand the confidence/probability value returned along with a prediction, I have come across a curious situation that I’m wondering if anyone can help explain.

Consider the hotgym model (using the Network API, obtained from http://nupic.docs.numenta.org/stable/quick-start/network.html), but make two simple modifications to the input data:

Change the time increments to be 15 minutes
Change the data to be a (noiseless) sine wave
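A minimal sketch of how input rows with these two modifications could be generated. The start date, the timestamp format, and the helper name `sine_rows` are assumptions for illustration, not the original poster's data or script.

```python
import math
from datetime import datetime, timedelta

SAMPLES_PER_DAY = 24 * 4  # one sample every 15 minutes


def sine_rows(num_rows, cycles_per_day=1.0, start=datetime(2019, 1, 1)):
    """Return (timestamp string, value) rows for a noiseless sine wave.

    The start date is arbitrary; only the 15-minute spacing and the
    sine shape come from the description in the post.
    """
    rows = []
    for i in range(num_rows):
        timestamp = start + timedelta(minutes=15 * i)
        phase = 2.0 * math.pi * cycles_per_day * i / SAMPLES_PER_DAY
        value = round(math.sin(phase), 6)
        rows.append((timestamp.strftime("%Y-%m-%d %H:%M:%S"), value))
    return rows


rows = sine_rows(SAMPLES_PER_DAY)  # one full day, one full cycle
for row in rows[:4]:
    print("%s,%s" % row)
```

With one cycle per day, the wave peaks at 06:00 and bottoms out at 18:00, so each time of day always maps to the same value.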

Hence the first few rows of the input data look like,

After a suitable run-in period, this generates the 1-step ahead prediction (upper axis) at the specified confidence level (lower axis) shown in the image below.

I don’t understand why the confidence is consistently high for predictions that are rising (from below 0.5 towards 1) and then going over the top of the sine wave, whereas the confidence for the predictions that are falling (from -0.5 towards -1) and then going around the bottom of the sine wave can be low.

Why does the CLA have more difficulty making a prediction in one region than another when the symmetry of the data would indicate (at least naively) that there is no difference between the regions?
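For readers unfamiliar with how such a confidence value can arise, here is a small sketch. It assumes the confidence is the probability that a classifier assigns to its most likely output bucket; the thread does not spell this out, and the bucket values and distributions below are made up for illustration.

```python
def best_prediction(bucket_values, probabilities):
    """Return (predicted value, confidence) from a per-bucket distribution.

    Assumption: confidence is simply the probability of the winning bucket.
    """
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return bucket_values[best], probabilities[best]


# Hypothetical buckets and distributions, not taken from the model in the post.
bucket_values = [-1.0, -0.5, 0.0, 0.5, 1.0]
sharp = [0.01, 0.01, 0.02, 0.06, 0.90]  # mass concentrated in one bucket
split = [0.45, 0.40, 0.10, 0.03, 0.02]  # mass shared by two neighbouring buckets

print(best_prediction(bucket_values, sharp))
print(best_prediction(bucket_values, split))
```

Under this assumption, the predicted value can be right in both cases while the confidence differs, because the confidence reflects only how concentrated the distribution is.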

rhyolight · March 18, 2019, 7:42pm

The sine wave is perfect periodic input, but every time it sees a cycle, the TM cannot distinguish between seeing one repetition of the pattern, or two, or three or 1000. It never knows to close the period and see that it’s just a simple pattern repeating. So the confidence will never settle.

Try resetting the TM once per period and see what the confidence looks like. (I assume the confidence is some form of anomaly likelihood?)
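A sketch of what resetting once per period could look like. `CountingMemory` is a hypothetical stand-in that only counts calls; it is not NuPIC's temporal memory, and how a reset is wired through the Network API is not shown here.

```python
class CountingMemory(object):
    """Hypothetical stand-in for a temporal memory; it only counts calls."""

    def __init__(self):
        self.resets = 0
        self.steps = 0

    def reset(self):
        self.resets += 1

    def compute(self, record):
        self.steps += 1


def run_with_resets(memory, records, samples_per_period):
    """Feed records in order, resetting at the start of every period."""
    for i, record in enumerate(records):
        if i % samples_per_period == 0:
            memory.reset()  # tell the memory a new sequence starts here
        memory.compute(record)


memory = CountingMemory()
run_with_resets(memory, range(96 * 3), samples_per_period=96)
print(memory.resets, memory.steps)
```

With 15-minute samples and one cycle per day, `samples_per_period` would be 96.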

PhilGoddard · March 18, 2019, 8:52pm

That would make sense if the confidence was consistent (either always good or always bad or always jittery).

However I don’t see how it explains the observed behavior.

If it were true, then at the start of the pattern (from say midnight to 6am, i.e. the rise of the sine at the beginning of the day) it shouldn’t distinguish between 1, 2, or 1000 patterns either, and the confidence in that region would be bad too. Yet it isn’t.

rhyolight · March 18, 2019, 9:06pm

That is interesting, I hadn’t thought of it that way. I really don’t understand it either.

marty1885 · March 19, 2019, 4:33am

There is a solution for the problem. But it is more computationally expensive and not implemented in NuPIC.

Detailed discussion here:

PhilGoddard · March 25, 2019, 3:26pm

I’m still trying to better understand why the prediction confidence is not symmetric with the above sine-wave example.

The link given by @marty1885 is very interesting in and of itself, but it’s not clear that the sine-wave example fits into the problem described there. Specifically, the sine-wave example, since it is based on the hotgym example, does use time encoding, and as with hotgym the data is tagged with time of day and day of week.

Unfortunately, resetting the TM every period doesn’t make sense either. Although it would be possible with the sine-wave example, it would not be possible for the practical problem that this would feed into.

But even if those things applied, they don’t explain why the prediction confidence would be asymmetric. (They might explain why it is good or bad, but not why it’s asymmetric.)

I’m still looking for suggestions on why the asymmetry exists.

rhyolight · March 25, 2019, 4:08pm

There are two things being encoded in your example: sine & time. There is one sine wave per time period (day). What if you increased it to be several waves per period? Or 1.5 waves per time period? I would be interested to see how the prediction confidence responded over time to the dual resolving semantics. I’m not sure if it would answer any questions, however, just curious.
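One way to reason about how the two encoded signals line up is to ask how many days pass before the pairing of time of day and sine value repeats exactly. The sketch below assumes a noiseless wave with a fixed number of cycles per day; the function name is made up for illustration.

```python
from fractions import Fraction


def days_until_pairing_repeats(cycles_per_day):
    """Days before the (time of day, sine value) pairing repeats exactly.

    The pairing repeats after D days when cycles_per_day * D is a whole
    number, so the smallest D is the denominator of the reduced fraction.
    """
    return Fraction(cycles_per_day).limit_denominator(1000).denominator


for cycles in (1, 1.5, 3):
    print(cycles, days_until_pairing_repeats(cycles))
```

With a whole number of cycles per day, every time of day always sees the same value. With 1.5 cycles per day, the same time of day alternates between two values on successive days, so the time encoding and the value encoding no longer agree day to day.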

Paul_Lamb · March 25, 2019, 5:47pm

The only thing that comes to my mind is perhaps the encodings for the values not being granular enough. Just intuitively, I would have expected the confidence to be lowest around the middle of the upward slope and middle of the downward slope, and for confidence to be highest near the peaks and valleys. Your graph is more what I would intuitively expect if there were a lot of overlap in the encodings between (-1, 0.5), and less overlap between those encodings and the ones between (0.5, 1). I hadn’t considered @rhyolight’s observation about time also being part of the encoding, so I’ll ponder on it some more, taking that into account.
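The overlap idea can be checked with a deliberately simplified scalar encoder. This is a sketch, not the encoder used in the hotgym example: the parameters `n` and `w`, the range, and the contiguous-run scheme are all assumptions.

```python
def encode(value, min_val=-1.0, max_val=1.0, n=100, w=21):
    """Simplified scalar encoder: a contiguous run of w active bits out of n."""
    value = min(max(value, min_val), max_val)  # clip to the encoder range
    start = int(round((value - min_val) / (max_val - min_val) * (n - w)))
    return set(range(start, start + w))


def overlap(a, b):
    """Number of active bits shared by the encodings of a and b."""
    return len(encode(a) & encode(b))


print(overlap(0.9, 1.0), overlap(-0.9, -1.0))
```

In this uniform sketch, the overlap depends only on the distance between two values, so it is the same near the top of the wave as near the bottom. An encoder like this would not produce an asymmetry by itself; the thread does not establish whether the encoder in the actual example behaves the same way.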

PhilGoddard · March 26, 2019, 4:44am

Here are some images for different periods of the sine wave per day. Note that I have maintained the samples at 15min time intervals.

The first is for 1.5 full cycles per 24 hours.

The second is for 0.75 full cycles per 24 hours.

It’s tempting to still see patterns in the confidence for these images. In particular the predictions seem to have better confidence just after the peaks/troughs and worse near 0. This is consistent with there being a higher density of points near the peak/troughs than near 0.
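The density claim can be checked by counting how many uniformly spaced samples of one cycle fall near the extremes versus near zero. The thresholds 0.9 and 0.1 are arbitrary choices for illustration.

```python
import math


def fraction_of_samples(predicate, samples_per_cycle=96):
    """Fraction of uniformly spaced samples of one sine cycle matching predicate."""
    values = [math.sin(2.0 * math.pi * i / samples_per_cycle)
              for i in range(samples_per_cycle)]
    matching = sum(1 for v in values if predicate(v))
    return matching / float(len(values))


near_extremes = fraction_of_samples(lambda v: abs(v) > 0.9)
near_zero = fraction_of_samples(lambda v: abs(v) < 0.1)
print(near_extremes, near_zero)
```

Because the sine changes slowly near its peaks and troughs, several times as many samples land in a band of width 0.1 near the extremes as in a band of the same width on either side of zero.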

They’re fresh off the press, so I haven’t decided how to interpret them properly yet.

rhyolight · March 26, 2019, 1:20pm

Thanks for running this experiment, Phil. How many data points have these models seen at the point where you graphed them?

It might be helpful to plot the anomaly scores and likelihoods in addition to (or instead of) your confidence value. I’m not sure what extra calculations you are doing, but the raw anomaly score can sometimes be enlightening in situations like this.
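For reference, the raw anomaly score is usually described as the fraction of currently active columns that were not predicted at the previous step. The sketch below implements that description in plain Python; it is not NuPIC's own code, and the column indices are made up.

```python
def raw_anomaly_score(active_columns, predicted_columns):
    """Fraction of active columns that were not predicted (0.0 if none are active)."""
    active = set(active_columns)
    if not active:
        return 0.0
    unpredicted = active - set(predicted_columns)
    return len(unpredicted) / float(len(active))


print(raw_anomaly_score([1, 2, 3, 4], [1, 2, 3, 4]))  # fully predicted
print(raw_anomaly_score([1, 2, 3, 4], [1, 2]))        # half predicted
```

A score that stays near zero means the active columns were predicted at nearly every step, which is a separate question from how confident the classifier is in a particular value.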

I hate to add yet another experiment parameter to this, but you might also add a jitter to the sine wave (random noise), which makes it more natural. I suspect the system might perform a bit better with added noise like this, since it forces it to generalize rather than memorize the same locked-in pattern over and over.

PhilGoddard · March 26, 2019, 2:12pm

All the images are for a model that has seen about 16000 data points, corresponding to about 160 sine wave cycles.

I was deliberately avoiding adding noise to remove randomness from the problem (although I have wondered whether some randomness would improve the learning.)

pulinagrawal · May 1, 2019, 7:50pm

Hi Matt,

As you asked, I plotted anomaly scores and tried adding jitter.

Following is a plot with anomaly scores in red. Orange is the confidence and blue is the one step prediction.

I tried to add jitter to the sine wave. About 1% of the input values had random noise.

Following is a plot from that. Again blue is the one step prediction and orange is the confidence.

The same trend shows up here. This also means it is not a “Repeating Inputs” problem.

Pulin Agrawal
पुलिन अग्रवाल

pulinagrawal · May 1, 2019, 8:03pm

Also, the model had seen more than 18000 data points before I plotted these.

rhyolight · May 1, 2019, 8:05pm

Interesting, thanks for trying this. I was thinking more like adding a small amount of random noise (maybe +/-5%) to every single input point.
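The two jitter schemes discussed here differ in how many points they touch. The sketch below contrasts them, interpreting "+/-5%" as uniform additive noise of up to 0.05 on a unit-amplitude wave; that interpretation, the seed, and the function names are assumptions.

```python
import math
import random


def sparse_noise(values, fraction=0.01, scale=0.05, seed=0):
    """Perturb roughly `fraction` of the points; leave the rest untouched."""
    rng = random.Random(seed)
    noisy = []
    for v in values:
        if rng.random() < fraction:
            v = v + rng.uniform(-scale, scale)
        noisy.append(v)
    return noisy


def dense_noise(values, scale=0.05, seed=0):
    """Perturb every point by a small uniform amount."""
    rng = random.Random(seed)
    return [v + rng.uniform(-scale, scale) for v in values]


clean = [math.sin(2.0 * math.pi * i / 96) for i in range(96 * 100)]
sparse = sparse_noise(clean)
dense = dense_noise(clean)
```

In the sparse scheme almost every cycle is still an exact repeat of the previous one, whereas in the dense scheme no two cycles are identical.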

I don’t understand why the anomaly score goes up and down with the wave.

sheiser1 · May 1, 2019, 8:32pm

I thought the confidence values were the ones fluctuating, @pulinagrawal did you show the anomaly scores in your plots?

Bitking · May 1, 2019, 8:55pm

We see stuff like this when doing integer math on MPUs. I don’t know the code and am totally uninterested in digging through nested OOP crap but this problem looks strangely familiar.

Problems with signed vs absolute values come to mind.
Where this gets to be a real problem is comparisons.
Another common and related problem is shifting frames of reference during calculations. IE: picking the wrong intermediate value for further calculations.

Two other perennial problems are a) getting scaling wrong. Note that the error seems to vary with the absolute value of the sine wave. If the calculation was using this input to establish scaling of the value the error would look something like what is shown. b) This is based on a time varying value; using the “before” in one part of the calculation and the “after” in a different part can be maddeningly difficult to find. This is an insidious version of the wrong frame problem.

Adding two cents that prolly do NOT apply to this problem.
Picking the wrong stride for the measurement can produce “chunky” calculations. The delta between readings is the dT and dV part of your slope calculation. If this is either aliasing with the samples taken of the raw data (unlikely in this case) or too small for the representation (integer math), it can cause problems.

pulinagrawal · May 2, 2019, 12:34am

Yes, the confidence values are the ones that are fluctuating abnormally, for no apparent reason.

Yes, the first plot shows prediction (in blue), confidence (in orange) and anomaly scores (in red). The anomaly score is the red line across the center, mostly 0 with tiny blips here and there.

pulinagrawal · May 2, 2019, 1:28am

These are the results from a +/-5% noise applied to every input.

Bitking · May 2, 2019, 1:39am

I am assuming that the sine wave takes values between 1 and -1. Just for diagnosis, try between 0 and 1.

PhilGoddard · May 2, 2019, 3:18am

Going between 0 and 1 makes no material difference to the results presented.
