Hot Gym 96-step-ahead prediction: AssertionError

Hi everyone,

This is my first time experimenting with NuPIC after a friend recommended it to me. I work on load forecasting for buildings, so I thought I would try to build on the hot gym example!

I am interested in predicting more than just one step ahead, and I work with 15-minute interval data, so I changed the code in a few places to make predictions 2 h (8 steps) ahead:
In swarm_description.py:
"predictionSteps": [1, 2, 3, 4, 5, 6, 7, 8]
plus the appropriate "maxValue" and "minValue" for my data,
and then changed run.py and nupic_output.py as suggested on this forum.
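For reference, the relevant part of my swarm_description.py now looks roughly like this (the min/max values and the CSV name are placeholders for my building data, not the original hot gym values):

SWARM_DESCRIPTION = {
  "includedFields": [
    {"fieldName": "timestamp", "fieldType": "datetime"},
    {"fieldName": "kw_energy_consumption",
     "fieldType": "float",
     "minValue": 0.0,     # adjusted to my data range
     "maxValue": 500.0},  # adjusted to my data range
  ],
  "streamDef": {
    "info": "kw_energy_consumption",
    "version": 1,
    "streams": [
      {"info": "building load",
       "source": "file://building_load_15min.csv",  # placeholder file name
       "columns": ["*"]},
    ],
  },
  "inferenceType": "TemporalMultiStep",
  "inferenceArgs": {
    "predictionSteps": [1, 2, 3, 4, 5, 6, 7, 8],  # 8 x 15 min = 2 h ahead
    "predictedField": "kw_energy_consumption",
  },
  "iterationCount": -1,
  "swarmSize": "medium",
}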

Everything worked fine with medium and large swarm sizes, so I decided to move on to my original goal of 1-day-ahead prediction (so 96 steps). As expected it takes quite some time to run (15 min for a small swarm), and when I try a medium swarm the swarming fails every time after a while with:

722 <jobID: 1008> 9  models finished [success: 8; EOF: 8; stopped: 0; killed: 0; error: 0; orphaned: 1; unknown: 0]
723 Evaluated 10 models
724 HyperSearch finished!
725 Worker completion message: None
726 
727 Results from all experiments:
728 ----------------------------------------------------------------
729 Generating experiment files in directory: /tmp/tmpjP8xrz...
730 Writing  312 lines...
731 Writing  114 lines...
732 done.
733 None
...
774 [8] Experiment _NupicModelInfo(jobID=1008, modelID=1516, status=completed, completionReason=eof, updateCounter=120, numRecords=11529)
775 (modelParams|sensorParams|encoders|timestamp_weekend:radius_1.modelParams|clParams|alpha_0.0001.modelParams|tpParams|minThreshold_9.modelParams|tpParams|activationThreshold_12.modelParams|tpParams|pamLength_1.modelParams|sensorParams|encoders|_classifierInput|n_28.modelParams|spParams|synPermInactiveDec_0.1):
776   multiStepBestPredictions:multiStep:errorMetric='aae':steps=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96]:window=1000:field=kw_energy_consumption:  153407.1894843198
777   multiStepBestPredictions:multiStep:errorMetric='altMAPE':steps=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96]:window=1000:field=kw_energy_consumption:  41.4741570673682
778 
779 Traceback (most recent call last):
780   File "swarm.py", line 109, in <module>
781     swarm(INPUT_FILE)
782   File "swarm.py", line 101, in swarm
783     modelParams = swarmForBestModelParams(SWARM_DESCRIPTION, name)
784   File "swarm.py", line 78, in swarmForBestModelParams
785     verbosity=0
786   File "/home/rschucker/anaconda2/lib/python2.7/site-packages/nupic/swarming/permutations_runner.py", line 271, in runWithConfig
787     return _runAction(runOptions)
788   File "/home/rschucker/anaconda2/lib/python2.7/site-packages/nupic/swarming/permutations_runner.py", line 212, in _runAction
789     returnValue = _runHyperSearch(runOptions)
790   File "/home/rschucker/anaconda2/lib/python2.7/site-packages/nupic/swarming/permutations_runner.py", line 155, in _runHyperSearch
791     metricsKeys=search.getDiscoveredMetricsKeys())
792   File "/home/rschucker/anaconda2/lib/python2.7/site-packages/nupic/swarming/permutations_runner.py", line 875, in generateReport
793     if modelInfo.getCompletionReason().isEOF():
794   File "/home/rschucker/anaconda2/lib/python2.7/site-packages/nupic/swarming/permutations_runner.py", line 2316, in getCompletionReason
795     assert self.isFinished(), "Too early to tell: %s" % self
796 AssertionError: Too early to tell: _NupicModelInfo(jobID=1008, modelID=1517, status=running, completionReason=None, updateCounter=114, numRecords=11300)

I tried different values of maxWorkers in swarm.py, but that doesn't help.
Any suggestions on what I can do? Should I open a GitHub issue for this?
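For context, maxWorkers is the value I am varying in the options dict passed to permutations_runner.runWithConfig, roughly as in the tutorial's swarm.py (the worker count and the work directory here are just illustrative):

import os
from nupic.swarming import permutations_runner

def swarmForBestModelParams(swarmConfig, name, maxWorkers=4):
  # Run the swarm and return the best model params it finds.
  permWorkDir = os.path.abspath("swarm")
  if not os.path.exists(permWorkDir):
    os.mkdir(permWorkDir)
  modelParams = permutations_runner.runWithConfig(
    swarmConfig,
    {"maxWorkers": maxWorkers, "overwrite": True},  # worker count varied here
    outputLabel=name,
    outDir=permWorkDir,
    permWorkDir=permWorkDir,
    verbosity=0)
  return modelParams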

model_params are still being generated, so I thought I would try them, but obviously the results are not really good…

In red is another kernel regression method I developed.


Try just "predictionSteps": [8] and see if the swarm runs.

The swarm runs fine both with "predictionSteps": [8] and with "predictionSteps": [1, 2, 3, 4, 5, 6, 7, 8]; it is with "predictionSteps": [1, 2, ..., 96] that it runs into problems (medium swarm)… but I am trying "predictionSteps": [96] now!

What is the difference between [96] and [1, 2, 3, …, 96]? For my application I am interested in predicting the whole 96-point day-ahead vector; wouldn't [96] only predict the point that is 96 steps ahead and not the others?

Yes that’s correct, but understand that this is probably going to take a very long time. I don’t think we’ve ever tried to predict so many steps ahead before.


Yes, I expect it to take some time! I have already set up a cluster to let it run over the weekend! I am very curious to see how HTM performs for longer-horizon prediction. As expected, with "predictionSteps": [96] the swarm runs smoothly. I changed the model_params manually to force a 96-vector output and also increased n as suggested here:
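Concretely, my manual edits to the generated model_params.py look roughly like this (the encoder key path is the one shown in the swarm report above; the n value is just what I tried, not a recommendation, and I am assuming clParams "steps" is the comma-separated string used in the generated params):

# Ask for a classifier output at every step from 1 to 96.
MODEL_PARAMS["modelParams"]["clParams"]["steps"] = ",".join(
    str(i) for i in range(1, 97))
# Increase the resolution of the classifier input encoder.
MODEL_PARAMS["modelParams"]["sensorParams"]["encoders"]["_classifierInput"]["n"] = 500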

To go back to the original problem: it looks like during swarming it tries to access the results of a model that is still running. Any ideas how to alleviate that? I am trying to reduce the number of rows to swarm over, but I'm not sure if that will work…

Yes, [96] only predicts the point 96 steps into the future and not the interim steps. There is one major computational implication. We use a classifier to map internal HTM sparse states to predictions that are k steps ahead. When you do "predictionSteps": [1, 2, …, 96] it will actually instantiate 96 separate classifiers. Each classifier can consume non-trivial memory and CPU, so 96 of them is going to be quite heavy. Unless you really need all the interim predictions, I would recommend sticking with [96].
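If you do stick with [96], your run.py only needs to read that single key out of the inference results. A minimal sketch, assuming the tutorial's field names and a placeholder CSV (the import path may be model_factory rather than modelfactory in newer NuPIC versions):

import csv
import datetime
from nupic.frameworks.opf.modelfactory import ModelFactory
import model_params  # the swarm-generated parameters

model = ModelFactory.create(model_params.MODEL_PARAMS)
model.enableInference({"predictedField": "kw_energy_consumption"})

with open("building_load_15min.csv") as fileIn:  # placeholder file name
  reader = csv.reader(fileIn)
  reader.next()  # skip header row(s) as appropriate for your file
  for row in reader:
    timestamp = datetime.datetime.strptime(row[0], "%m/%d/%y %H:%M")
    consumption = float(row[1])
    result = model.run({
        "timestamp": timestamp,
        "kw_energy_consumption": consumption,
    })
    # With predictionSteps [96] there is a single classifier, keyed by 96.
    prediction = result.inferences["multiStepBestPredictions"][96]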

Note that it should not take much longer to do 96-step prediction than one-step prediction. It might need a bit more data. You don't need to swarm over all the rows; you could just use 1500 rows or so and then use the resulting model parameters to run the entire dataset.
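One way to cap how many rows the swarm sees is the iterationCount key in the swarm description, something like the following (the 1500 just mirrors the suggestion above):

# In swarm_description.py: swarm over only the first ~1500 records.
# The default of -1 means "use all rows".
SWARM_DESCRIPTION["iterationCount"] = 1500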
