Assuming it is working correctly I would agree.
Unless there is a bug in the software.
One possible bug is in the handling of negative values.
I think you’ve misunderstood my statement. I’m not suggesting that there should be no material difference, I’m stating that the above examples have all been executed with the sinewave going from -1 to 1, and going from 0 to 1, and that there is no material difference between the results.
I did not see where that was tested in the graphs posted above.
It seems Phil is correct. I found no significant difference in the results when going from 0 to 1.
The following graph demonstrates that.
That’s odd, I see that the confidence values are visually much higher.
No dips below 20%, vs 13 in the bipolar example, and 6 below 40%, vs 39 below 40% in the -1 to 1 version.
How can you say that this is the same?
If this is in percent then the values should be the same no matter what the magnitude.
I am just guessing that if you went between a smaller range (say 0 to 0.1) the confidence spread would be even less.
I don’t think that this is calculating what it is supposed to be calculating.
Let’s clear up terms just to make sure we are all talking about the same things. Here are the typical terms I usually use:
 “anomaly score” comes out of NuPIC as a raw value between 0 and 1
 “anomaly likelihood” is the postprocess we run to get a value that is easy to threshold, which is used to create anomaly flags
Questions:
 What is “confidence”?
 How are you getting the 1 flags in your red line? Are you using anomaly likelihood?
 When you changed your input range from [-1, 1] to [0, 1], did you properly update the encoder min/max?
It is the “predictionConfidence” of the prediction (per Network API — NuPIC 1.0.5 documentation)
Those are “anomalyScores” directly from the TMRegion. I don’t think it is “anomaly likelihood”, because I did not do any postprocessing .
Not for these graphs, because I like to make a minimal number of changes while debugging. But I did try it after changing min/max. It does not make any significant difference. (‘significance’ as implied by the pertinent question here, as quoted below)
In any event, I feel I know where this problem is coming from. Will post my observations from debugging in a following post to this thread.
So I observed from some debugging that the actualValues (and possibly probabilities), obtained as a consequence of running the code at http://nupic.docs.numenta.org/1.0.5/quickstart/network.html#gettingpredictions (quoted below), are not correct.
    def getPredictionResults(network, clRegionName):
        """Helper function to extract results for all prediction steps."""
        classifierRegion = network.regions[clRegionName]
        actualValues = classifierRegion.getOutputData("actualValues")
        probabilities = classifierRegion.getOutputData("probabilities")
        steps = classifierRegion.getSelf().stepsList
        N = classifierRegion.getSelf().maxCategoryCount
        results = {step: {} for step in steps}
        for i in range(len(steps)):
            # stepProbabilities are probabilities for this prediction step only.
            stepProbabilities = probabilities[i * N:(i + 1) * N - 1]
            mostLikelyCategoryIdx = stepProbabilities.argmax()
            predictedValue = actualValues[mostLikelyCategoryIdx]
            predictionConfidence = stepProbabilities[mostLikelyCategoryIdx]
            results[steps[i]]["predictedValue"] = predictedValue
            results[steps[i]]["predictionConfidence"] = predictionConfidence
        return results
actualValues is supposed to be a map from bucket index to an estimate of the actual input value that the encoder will see in that bucket, and probabilities is supposed to be a map from bucket index to probability.
Let me show you the output actualValues and probabilities for the first couple of values.
first input is 0.5 (corresponds to a bucket index of ‘89’)
second input is 0.53270155 (corresponding to a bucket index of ‘94’)
A more detailed view of actualValues shows that all the values up to index 88 are 0.53270155, the value at index 89 is 0.5, and the rest are 0.0. Similarly, probabilities are all 0.011111 up to index 89 and the rest are 0.0.
It seems that any bucket that has not yet been seen is filled with the most recent value in actualValues. I am not sure about the exact calculation of probabilities, but if the same kind of problem is going on with probabilities, then it will cause problems with the “predictionConfidence”.
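To make the consequence concrete, here is a minimal sketch that applies the same argmax logic as the quoted helper to arrays shaped like the ones described above. The values 0.5 and 0.53270155, the index 89, and the uniform probability of roughly 0.011111 (1/90) come from the reported output; the total array length N = 130 and the pure-Python argmax are assumptions made only for illustration.

```python
def argmax(values):
    """Return the index of the first maximum (same tie-breaking as numpy.argmax)."""
    return max(range(len(values)), key=lambda i: values[i])


N = 130    # hypothetical total number of buckets
SEEN = 90  # buckets 0..89 carry non-zero entries, as reported above

# Buckets 0..88 hold the most recent input, bucket 89 holds the first input.
actual_values = [0.53270155] * 89 + [0.5] + [0.0] * (N - SEEN)
# Every populated bucket has the same probability, 1/90.
probabilities = [1.0 / SEEN] * SEEN + [0.0] * (N - SEEN)

best = argmax(probabilities)
predicted_value = actual_values[best]
prediction_confidence = probabilities[best]

print(best, predicted_value, prediction_confidence)
```

With a flat distribution like this, argmax simply falls back to the first bucket, so the reported confidence is just the uniform share of one bucket and says little about how predictable the signal is.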
I think this value is the prediction confidence of the SDR Classifier, not related to the anomaly score. I don’t think this is something you should consider plotting. Perhaps we are all looking at the wrong metric? I thought we were talking about the raw anomaly score going up and down with the signal.
But anomaly scores range widely between 0 and 1. The red line seems to plot only 0 and 1. This does not make sense to me.
Ok, but the HTM is not encoding the input in the range you expect, so I would definitely fix this.
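As a rough illustration of why the encoder range matters, the sketch below buckets a 0 to 1 sine wave twice: once with a stale min/max of -1 and 1, and once with the range updated to 0 and 1. The bucket_index function is a hypothetical linear bucketing with clipping, not NuPIC's encoder, and the 100 buckets and 1000 samples are arbitrary choices.

```python
import math


def bucket_index(value, min_val, max_val, n_buckets=100):
    """Hypothetical linear bucketing; values outside [min_val, max_val] are clipped."""
    clipped = min(max(value, min_val), max_val)
    fraction = (clipped - min_val) / float(max_val - min_val)
    return min(int(fraction * n_buckets), n_buckets - 1)


# One period of a sine wave rescaled to the range 0..1.
signal = [0.5 + 0.5 * math.sin(2 * math.pi * t / 1000.0) for t in range(1000)]

# Encoder still configured for -1..1: only the upper half of the buckets is used.
stale = set(bucket_index(v, -1.0, 1.0) for v in signal)
# Encoder min/max updated to match the data: the whole bucket range is available.
updated = set(bucket_index(v, 0.0, 1.0) for v in signal)

print(min(stale), max(stale), len(stale))
print(min(updated), max(updated), len(updated))
```

Under these assumptions the stale range never uses a bucket below the midpoint, so the input is represented at roughly half the resolution it would get with a matching min/max.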
As for your next post, I am not sure this is relevant because you are still comparing to the SDR Classifier’s prediction confidence, not raw anomaly scores.
To Summarize
Don’t use the prediction confidence of the SDR Classifier as an anomaly indication. Use the raw anomaly scores, or better yet, use the anomaly likelihood and a threshold to flag anomalies.
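For readers who have not seen that kind of postprocess, here is a deliberately simplified sketch of the idea: compare a short-term average of raw anomaly scores against the longer history and flag when the result crosses a threshold. This is not NuPIC's anomaly likelihood code; the class name, the window sizes, the standard-deviation floor and the 0.99 threshold are all assumptions for illustration.

```python
import math
from collections import deque


class SimpleAnomalyLikelihood(object):
    """Simplified stand-in for an anomaly-likelihood postprocess (not NuPIC's code)."""

    def __init__(self, history=200, short_window=10):
        self.scores = deque(maxlen=history)       # long-term history of raw scores
        self.recent = deque(maxlen=short_window)  # short-term window

    def update(self, raw_score):
        self.scores.append(raw_score)
        self.recent.append(raw_score)
        n = len(self.scores)
        mean = sum(self.scores) / float(n)
        var = sum((s - mean) ** 2 for s in self.scores) / float(n)
        std = max(math.sqrt(var), 1e-3)  # floor avoids dividing by ~0
        short_avg = sum(self.recent) / float(len(self.recent))
        z = (short_avg - mean) / std
        # Gaussian upper-tail probability Q(z); the likelihood is 1 - Q(z).
        tail = 0.5 * math.erfc(z / math.sqrt(2))
        return 1.0 - tail


THRESHOLD = 0.99  # hypothetical; tune for your data

detector = SimpleAnomalyLikelihood()
raw_scores = [0.05] * 200 + [1.0] * 5  # quiet baseline, then a burst of high scores
likelihoods = [detector.update(s) for s in raw_scores]
flags = [value >= THRESHOLD for value in likelihoods]
```

The point of the extra step is that a single noisy raw score does not trip the flag on its own; it takes a sustained departure from the recent history.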
Matt, but Phil’s question was about the inconsistency in predictionConfidence. We are not interested in anomaly likelihood or anomaly scores.
Correct. My point is that this prediction confidence is not indicating the performance of the HTM so much as the performance of the SDR Classifier.
Ok. Thank you for the link to the twitch stream. It helped in asking the right question here.
So, it seems that you do not want us to look at predictionConfidence, but only at anomaly scores. I thought predictionConfidence is important because if we want to use predictions coming out of HTM, which is done using an SDR Classifier, then a measure of confidence about those predictions is also important. Do you think anomaly scores can replace prediction confidence in that sense?
On a side note, https://www.youtube.com/watch?v=QZBtaP_gcn0 this video explains how prediction confidence is calculated by SDR Classifier.
It is very important if you want to convert the TM’s output into predictions in the input language of the encoder (in this case a number representing the sine wave output). I am saying a closer reflection of the uncertainty of the HTM system can be monitored with the anomaly score, a direct representation of the TM’s “uncertainty”. I’ll discuss this with @subutai when he gets back into the office to see if he has any corrections to my advice here.
Just thinking, could a fair ‘confidence’ measure for classifier forecasts simply be the number of predictive cells / 40? I say 40 because when the TM makes a single precise prediction I think there’d be ~40 predictive cells (one per activated SP column).
If the TM is predicting many different potential values for the next input then it is less ‘confident’ in precisely what the value will be right? I need to check how exactly the current ‘confidence’ is calculated, this just seemed a viable potential alternative.
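A minimal sketch of that idea, under stated assumptions: the function name is hypothetical, the count of predictive cells is assumed to be available from the TM, and the ratio is folded into the range 0 to 1 so that exactly 40 predictive cells scores 1.0 and anything above or below scores lower. That folding is my own choice, not something the proposal specifies.

```python
def prediction_sharpness(num_predictive_cells, cells_per_prediction=40):
    """Return 1.0 for one precise prediction, lower as the count departs from it.

    cells_per_prediction is the ~40 cells expected for a single precise
    prediction (one per active SP column); treat it as an assumption.
    """
    if num_predictive_cells <= 0:
        return 0.0  # nothing is predicted at all
    smaller = min(num_predictive_cells, cells_per_prediction)
    larger = max(num_predictive_cells, cells_per_prediction)
    return smaller / float(larger)


print(prediction_sharpness(40))   # one precise prediction
print(prediction_sharpness(120))  # roughly three simultaneous predictions
```

The raw ratio (predictive cells / 40) is still useful on its own as an estimate of how many alternatives the TM is predicting at once.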
I think this strategy would result in the average confidence getting lower over time for a repeating sequence like this one. Every time the loop reaches the end of what has been learned, you’d get a burst, putting every iteration of the pattern into predictive state for each subsequent time step (causing the confidence score to sharply tank then slowly recover until just before the next burst).
Right, and would that not be an accurate measure of the confidence in that case? If the TM has the hiccup where it bursts every so often on repeating sequences then that burst time step should have low confidence right?
Yes. I agree that the “repeating inputs” problem results in low confidence over time, and this would be an accurate measure of that uncertainty. Just pointing out that this wouldn’t be a reflection of uncertainty in what the next predicted input will be, but rather what context it will be in. Subtle difference, but it means that one could reliably predict what the next input will be even though the confidence about its context is low (“I know I’m heading up toward the peak, but is this the 3rd time through the sine wave, or the 93rd?”).
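The distinction can be sketched by splitting the predictive cells into two numbers: how many columns are predicted (uncertainty about the next input) and how many cells per predicted column (uncertainty about context). The function name, the 32 cells per column, and the flat indexing scheme (cell index = column * cells_per_column + offset) are assumptions for illustration.

```python
def split_uncertainty(predictive_cells, cells_per_column=32, active_columns=40):
    """Return (value_ambiguity, context_ambiguity) for a set of flat cell indices.

    value_ambiguity:   predicted columns / expected active columns (about 1.0
                       means a single next input is predicted).
    context_ambiguity: predictive cells per predicted column (about 1.0 means
                       a single context is predicted).
    """
    cells = set(predictive_cells)
    columns = set(cell // cells_per_column for cell in cells)
    if not columns:
        return None  # nothing is predicted
    value_ambiguity = len(columns) / float(active_columns)
    context_ambiguity = len(cells) / float(len(columns))
    return value_ambiguity, context_ambiguity


# 40 predicted columns, each with 5 predictive cells: the next value is
# unambiguous, but five different contexts are predicted for it.
example = [col * 32 + k for col in range(40) for k in range(5)]
print(split_uncertainty(example))
```

In the repeating-sine case described above, the first number would stay near 1 while the second grows with each extra pass through the loop.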
Two points:

With hierarchy, I would expect a higher level to feed back a context framing the sequence. In practice, this might look like the HTM time input encoder with the current input time fields staying the same and a new “context field” fed back from a higher level.

Do we have some HTM way of establishing a circular/repetitive task with online learning? This should be an important research line, as this is a common task in the brain. The “central pattern generator” comes to mind. Granted, this is a lower brain function, but I would not think it out of line to add a mechanism to do this with canonical HTM.
This is what I get when I plot all the previous things with the ‘py’ implementation of SDRClassifier, as @rhyolight suggested to @PhilGoddard in the thread “SDRClassifier returning confidence of NaN”. This might be a related issue.
In summary, with the ‘py’ implementation, predictionConfidence behaves as expected, i.e., it generally shows 100% confidence when predicting a very predictable signal.
That’s very interesting.