Asymmetric confidence when predicting a symmetric signal

#1

As part of trying to understand the confidence/probability value returned along with a prediction, I have come across a curious situation that I’m wondering if anyone can help explain.

Consider the hotgym model (using the Network API, obtained from http://nupic.docs.numenta.org/stable/quick-start/network.html), but make two simple modifications to the input data:

  1. Change the time increments to be 15 minutes
  2. Change the data to be a (noiseless) sine wave
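A minimal sketch of how input like this could be generated. The start date, timestamp format, unit amplitude, and zero phase at midnight are assumptions for illustration, and `sine_rows` is a hypothetical helper, not part of NuPIC or the hotgym example.

```python
import math
from datetime import datetime, timedelta

def sine_rows(n_rows, cycles_per_day=1.0, step_minutes=15,
              start=datetime(2020, 1, 1, 0, 0)):
    """Yield (timestamp, value) pairs for a noiseless sine wave.

    The start date is arbitrary; the wave is assumed to cross zero,
    rising, at midnight.
    """
    steps_per_day = 24 * 60 / float(step_minutes)  # 96 samples per day at 15 min
    for i in range(n_rows):
        ts = start + timedelta(minutes=i * step_minutes)
        value = math.sin(2 * math.pi * cycles_per_day * i / steps_per_day)
        yield ts.strftime("%m/%d/%y %H:%M"), value

# One full day of samples; print the first few to inspect them.
rows = list(sine_rows(96))
for ts, value in rows[:4]:
    print(ts, round(value, 4))
```

Setting `cycles_per_day` to 1.5 or 0.75 gives the other variants tried later in this thread while keeping the 15-minute sampling.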

Hence the first few rows of the input data look like,

After a suitable run-in period, this generates the 1-step ahead prediction (upper axis) at the specified confidence level (lower axis) shown in the image below.

I don’t understand why the confidence is consistently high for predictions that are rising (from below 0.5 towards 1) and then going over the top of the sine wave, whereas the confidence for predictions that are falling (from -0.5 towards -1) and then going around the bottom of the sine wave can be low.

Why does the CLA have more difficulty making a prediction in one region than another when the symmetry of the data would indicate (at least naively) that there is no difference between the regions?

2 Likes

#2

The sine wave is perfectly periodic input, but every time it sees a cycle, the TM cannot distinguish between seeing one repetition of the pattern, or two, or three, or 1000. It never knows to close the period and see that it’s just a simple pattern repeating. So the confidence will never settle.

Try resetting the TM once per period and see what the confidence looks like. (I assume the confidence is some form of anomaly likelihood?)
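A rough sketch of what "reset once per period" could look like in a driving loop. `step_fn` and `reset_fn` are hypothetical stand-ins for whatever calls feed a record to the model and clear its sequence state; they are not NuPIC API names. The demonstration uses recording stubs rather than a real model.

```python
def run_with_resets(values, samples_per_period, step_fn, reset_fn):
    """Feed each value to step_fn, calling reset_fn at every period boundary."""
    for i, v in enumerate(values):
        if i > 0 and i % samples_per_period == 0:
            reset_fn()  # hypothetical hook: clear the sequence state here
        step_fn(v)

# Demonstration with stubs: a "period" of 3 samples over 6 inputs.
log = []
run_with_resets(range(6), 3,
                step_fn=log.append,
                reset_fn=lambda: log.append("reset"))
print(log)
```

For the 15-minute sine data with one cycle per day, `samples_per_period` would be 96.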

1 Like

#3

That would make sense if the confidence was consistent (either always good or always bad or always jittery).

However I don’t see how it explains the observed behavior.

If it were true, then at the start of the pattern (from say midnight to 6am, i.e. the rise of the sine at the beginning of the day) it shouldn’t distinguish between 1, or 2, or 1000 patterns either, and the confidence in that region would be bad too. Yet it isn’t.

1 Like

#4

That is interesting, I hadn’t thought of it that way. I really don’t understand it either.

0 Likes

#5

There is a solution for the problem. But it is more computationally expensive and not implemented in NuPIC.

Detailed discussion here:

1 Like

#6

I’m still trying to better understand why the prediction confidence is not symmetric with the above sine-wave example.

The link given by @marty1885 is very interesting in and of itself, but it’s not clear that the sine-wave example fits into the problem described there. Specifically, the sine-wave example, since it is based on the hotgym example, does use time encoding, and as with hotgym the data is tagged with time of day and day of week.

Unfortunately, resetting the TM every period doesn’t make sense either. Although it would be possible with the sine-wave example, it would not be possible for the practical problem that this would feed into.

But even if those things applied, they don’t explain why the prediction confidence would be asymmetric. (They might explain why it is good or bad, but not why it’s asymmetric.)

I’m still looking for suggestions on why the asymmetry exists.

2 Likes

#7

There are two things being encoded in your example: sine & time. There is one sine wave per time period (day). What if you increased it to be several waves per period? Or 1.5 waves per time period? I would be interested to see how the prediction confidence responded over time to the dual resolving semantics. I’m not sure if it would answer any questions, however, just curious.

1 Like

#8

The only thing that comes to my mind is perhaps the encodings for the values not being granular enough. Just intuitively, I would have expected the confidence to be lowest around the middle of the upward slope and middle of the downward slope, and for confidence to be highest near the peaks and valleys. Your graph is more what I would intuitively expect if there were a lot of overlap in the encodings between (-1, 0.5), and less overlap between those encodings and the ones between (0.5, 1). I hadn’t considered @rhyolight’s observation about time also being part of the encoding, so I’ll ponder on it some more, taking that into account.
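To make the granularity idea concrete, here is a toy scalar encoder (a simplified stand-in, not NuPIC's encoder, with assumed parameters `n` and `w`) that shows how the overlap between the encodings of two nearby values depends on the number of bits available.

```python
def encode(value, n=50, w=11, vmin=-1.0, vmax=1.0):
    """Toy scalar encoder: a contiguous run of w active bits out of n.

    The start of the run moves linearly with the (clipped) value.
    """
    clipped = min(max(value, vmin), vmax)
    frac = (clipped - vmin) / (vmax - vmin)
    start = int(frac * (n - w))
    return set(range(start, start + w))

def overlap(a, b, **params):
    """Number of active bits shared by the encodings of a and b."""
    return len(encode(a, **params) & encode(b, **params))

coarse = overlap(0.0, 0.05)          # few bits: neighbouring values look alike
fine = overlap(0.0, 0.05, n=400)     # more bits: same values share far less
print(coarse, fine)
```

With a coarse encoding, two consecutive samples on the steep part of the sine share most of their bits, while a finer encoding separates them. Whether this accounts for the asymmetry in the graph is still only a guess.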

3 Likes

#9

Here are some images for different periods of the sine wave per day. Note that I have maintained the samples at 15min time intervals.

The first is for 1.5 full cycles per 24 hours.

The second is for 0.75 full cycles per 24 hours.

It’s tempting to still see patterns in the confidence for these images. In particular, the predictions seem to have better confidence just after the peaks/troughs and worse near 0. This is consistent with there being a higher density of points near the peaks/troughs than near 0.

They’re fresh off the press, so I haven’t decided how to interpret them properly yet.
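The density claim can be checked with a quick count. This sketch assumes 96 evenly spaced samples per cycle (the one-cycle-per-day case) and compares two value bands of equal total width: within 0.1 of zero, and within 0.1 of either extreme.

```python
import math

SAMPLES_PER_CYCLE = 96  # 15-minute samples, one cycle per day (assumed)

values = [math.sin(2 * math.pi * i / SAMPLES_PER_CYCLE)
          for i in range(SAMPLES_PER_CYCLE)]

# Both bands span a total value range of 0.2.
near_zero = sum(1 for v in values if abs(v) <= 0.1)
near_peak = sum(1 for v in values if abs(v) >= 0.9)

print("samples near zero:", near_zero)
print("samples near peaks/troughs:", near_peak)
```

Because the sine changes slowly near its extremes, uniform sampling in time places several times as many samples in the band near ±1 as in the band near 0.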

4 Likes

#10

Thanks for running this experiment, Phil. How many data points have these models seen at the point where you graphed them?

It might be helpful to plot the anomaly scores and likelihoods in addition to (or instead of) your confidence value. I’m not sure what extra calculations you are doing, but the raw anomaly score can sometimes be enlightening in situations like this.
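For reference, the raw anomaly score is usually described as the fraction of currently active columns that were not predicted at the previous step. A stand-alone sketch of that calculation using plain Python sets (an illustration of the definition, not the NuPIC implementation; the function name is made up here):

```python
def raw_anomaly_score(active_columns, prev_predicted_columns):
    """Fraction of active columns that were not predicted one step earlier.

    Returns 0.0 when there are no active columns (an assumed convention).
    """
    active = set(active_columns)
    if not active:
        return 0.0
    unpredicted = active - set(prev_predicted_columns)
    return len(unpredicted) / float(len(active))

# Half of the active columns were predicted, so the score is 0.5.
score = raw_anomaly_score([1, 2, 3, 4], [1, 2, 9])
print(score)
```

A score of 0 means the input was fully anticipated and 1 means it was entirely unexpected, so plotting it against the sine phase would show directly where the predictions miss.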

I hate to add yet another experiment parameter to this, but you might also add a jitter to the sine wave (random noise), which makes it more natural. I suspect the system might perform a bit better with added noise like this, since it forces it to generalize rather than memorize the same locked-in pattern over and over.
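A small sketch of what adding jitter could look like. The Gaussian noise model, the standard deviation of 0.05, and the fixed seed are all assumptions chosen for illustration; the seed keeps runs repeatable so the experiment stays comparable between trials.

```python
import math
import random

def noisy_sine(n, samples_per_cycle=96, noise_std=0.05, seed=42):
    """Sine wave with additive Gaussian noise, reproducible via the seed."""
    rng = random.Random(seed)
    return [math.sin(2 * math.pi * i / samples_per_cycle)
            + rng.gauss(0.0, noise_std)
            for i in range(n)]

jittered = noisy_sine(96)
clean = noisy_sine(96, noise_std=0.0)
print(max(abs(a - b) for a, b in zip(jittered, clean)))
```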

0 Likes

#11

All the images are for a model that has seen about 16000 data points, corresponding to about 160 sine wave cycles.

I was deliberately avoiding adding noise to remove randomness from the problem (although I have wondered whether some randomness would improve the learning).

0 Likes