Can you try to replicate this and give us a test? The problem is that you're probably using the Python OPF framework, which requires nupic.bindings from nupic.core and is not compatible with our bindings from nupic.cpp.
@PhilGoddard Assuming this is the code we’re talking about, can you get access to some of these values during a compute cycle where predictionConfidence is NaN?
Does classifierRegion.getOutputData("probabilities") or classifierRegion.getOutputData("actualValues") look any different in structure?
```python
def getPredictionResults(network, clRegionName):
  """Get prediction results for all prediction steps."""
  classifierRegion = network.regions[clRegionName]
  actualValues = classifierRegion.getOutputData("actualValues")
  probabilities = classifierRegion.getOutputData("probabilities")
  steps = classifierRegion.getSelf().stepsList
  N = classifierRegion.getSelf().maxCategoryCount
  results = {step: {} for step in steps}
  for i in range(len(steps)):
    # stepProbabilities are probabilities for this prediction step only.
    stepProbabilities = probabilities[i * N:(i + 1) * N - 1]
    mostLikelyCategoryIdx = stepProbabilities.argmax()
    predictedValue = actualValues[mostLikelyCategoryIdx]
    predictionConfidence = stepProbabilities[mostLikelyCategoryIdx]
    results[steps[i]]["predictedValue"] = predictedValue
    results[steps[i]]["predictionConfidence"] = predictionConfidence
  return results
```
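As a quick check (just a sketch, reusing the same network object; "classifier" stands in for whatever clRegionName you pass), you could dump the raw classifier outputs during a compute cycle where predictionConfidence comes back NaN:

```python
import numpy as np

# Sketch only: inspect the raw classifier outputs for NaN values.
# "classifier" is an assumed region name; substitute your clRegionName.
classifierRegion = network.regions["classifier"]
probabilities = classifierRegion.getOutputData("probabilities")
actualValues = classifierRegion.getOutputData("actualValues")
print("probabilities contains NaN:", np.isnan(probabilities).any())
print("actualValues contains NaN:", np.isnan(np.asarray(actualValues, dtype=float)).any())
```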
I will try to replicate and work on this issue in my live-stream tomorrow. I have a bunch of other community NuPIC issues to work on.
Not sure what you mean by “different in structure” – they are the expected size and data type.
Superficially, the problem is that probabilities has at least one NaN value in it, and argmax is returning the index of that NaN, so predictionConfidence is NaN.
An initial thought was to get the index of the max probability while ignoring NaN values. But that just masks the problem: it doesn't fix it, and it doesn't explain why there is a NaN in the first place.
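To illustrate with a minimal numpy sketch: argmax lands on the NaN entry, while the NaN-aware nanargmax would simply skip over it, which is exactly why that workaround only hides the symptom:

```python
import numpy as np

# numpy treats NaN as the maximum in argmax, so argmax returns the NaN's index;
# nanargmax ignores NaNs and returns the index of the largest valid value.
stepProbabilities = np.array([0.2, np.nan, 0.7])
print(np.argmax(stepProbabilities))     # 1  (the NaN entry)
print(np.nanargmax(stepProbabilities))  # 2  (the real maximum, 0.7)
```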
QQ: are you still sending in timestamps in the extended data? Are they duplicates of existing data the model has seen? Or do they continue forward in time logically?
Phil, I tried debugging this today for a while (video coming soon). It looks like a bug in the C++ SDRClassifier. If you change your implementation to py, does it work?
Can you elaborate on what the error is in the cpp implementation?
I guess I’d always assumed that the py implementation was really just a wrapper around the cpp implementation, rather than being a completely separate implementation.
I wish I knew. I had trouble debugging my Python 2 / C++ build environment this morning and ended up running pip install nupic, so I could not debug into the C++ code.
Yeah, me too. The only reason we'd do this in C++ is for speed. Strangely, this bug depends not only on how much data has been seen by the model, but also on the prediction step value. I have only seen this happen on 5-step predictions (not 1, not 4, not 50). I'm a bit flummoxed by this. I suspect the problem is that something in the weight matrix evaluates to 0 in some calculation, causing a divide-by-zero error. These matrices are summed in order to get the output, so if any of them contain NaN, it could cause this.
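For illustration only (this is not the actual SDRClassifier code), here is how a zero denominator in a normalization step would turn the whole probability vector into NaN, which would then surface exactly as described:

```python
import numpy as np

# Toy example of the suspected failure mode: if the summed activations are
# all zero, normalizing them is 0/0, which yields NaN for every entry.
activations = np.zeros(3)                         # hypothetical summed weights
probabilities = activations / activations.sum()   # 0/0 -> RuntimeWarning, NaN
print(probabilities)                              # [nan nan nan]
```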
Out of curiosity, how is HTM performing at ~20-steps-ahead prediction? That sounds pretty impressive.
OT: as I recall, all this >1-step-ahead prediction is a hack, because natively HTM predicts only the next step. So I imagine something like for i in steps: pred = HTM(pred), as sketched below.
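A toy version of that feedback idea (purely illustrative; htm_predict_next is a hypothetical one-step predictor, not a NuPIC API):

```python
def predict_n_steps_ahead(htm_predict_next, current_value, n_steps):
    # Naively chain one-step predictions, feeding each prediction back in.
    # htm_predict_next is a hypothetical callable, not part of NuPIC.
    prediction = current_value
    for _ in range(n_steps):
        prediction = htm_predict_next(prediction)
    return prediction
```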
I cannot be sure yet because I have not been able to give enough time to the cpp implementation, but this could possibly be related to the issue. If someone gets to debugging this in cpp, this may be something to look at.