Hi, I am playing with the one hotgym OPF anomaly example in NuPIC. Every 100 data points, I save the model, and the next time I load the model to continue the computation. I observed that at some point the prediction results are quite different from running the entire dataset in memory. I also tried saving and loading the model every 600 data points; the results start to diverge at a much later data point. I was wondering whether save/load can cause a loss of precision?
This certainly should not happen. Can you provide some more evidence of this behavior? Like predictions from a normal run against your data vs. a run where you serialize and resurrect your model in the middle of it.
Yes, one can easily reproduce my results. Just put something like the following in the loop to save and load the model every x data points:
if counter % x == 0:
    print "Read %i lines..." % counter
    model.save(my_path)
    # reload the checkpoint (ModelFactory is from nupic.frameworks.opf.modelfactory)
    model = ModelFactory.loadFromCheckpoint(my_path)
You can try different values of x to see how the results change. I am attaching the results for x = 100 and x = 600 (https://drive.google.com/open?id=0B4TNsSMedgSoZHgxVjZsWURIMFE); if x >= 3945, the results are identical to running the entire dataset in memory.
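In case it helps anyone check, here is a minimal Python 2 sketch that compares two OPF output CSVs row by row and prints the first data point where the predictions differ. The file names and the prediction column index are assumptions, so adjust them to your own output files.

```python
import csv

# Hypothetical file names: point these at your own OPF output CSVs.
IN_MEMORY_RUN = "results_in_memory.csv"
CHECKPOINTED_RUN = "results_checkpoint_100.csv"
PREDICTION_COLUMN = 2  # assumed index of the prediction/anomaly column

with open(IN_MEMORY_RUN) as f1, open(CHECKPOINTED_RUN) as f2:
    for i, (row1, row2) in enumerate(zip(csv.reader(f1), csv.reader(f2))):
        if row1[PREDICTION_COLUMN] != row2[PREDICTION_COLUMN]:
            print "First divergence at row %i: %s vs %s" % (
                i, row1[PREDICTION_COLUMN], row2[PREDICTION_COLUMN])
            break
    else:
        print "No divergence found."
```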
Indeed, see the disabled tests https://github.com/numenta/nupic/blob/64400aa71982adb8069ec595ecbd4b9950d23183/tests/integration/nupic/opf/opf_checkpoint_test/opf_checkpoint_test.py#L461-L483.
@rhyolight, both of the above-mentioned disabled tests reference the issue NUP-1864 in Numenta's JIRA. NUP-1864 is closed for some reason, but it should be reopened. Also, check with @subutai; I think he may have an explanation for this discrepancy.
Thanks, @vkruglikov and @rainyyun, for reporting. I've created a nupic.core issue on the open source tracker to cover this problem.
I believe this is due to the fact that the current serialization converts floating point numbers to strings and back again. Converting to a string causes a slight loss of precision, so results can diverge, although qualitatively there should be little effect on accuracy.
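I don't know the exact format string the serialization uses, but here is a minimal Python 2 sketch of the effect: pushing a 32-bit float through a fixed-precision decimal string and back does not always reproduce the same bits, and in a recurrent system those tiny differences can compound into visibly different predictions over time.

```python
import numpy as np

x = np.float32(2.0) / np.float32(3.0)   # e.g. a permanence-like value
s = "%g" % x                            # "%g" keeps only 6 significant digits
y = np.float32(s)                       # parse it back, as deserialization would

print "original:   %r" % float(x)
print "as string:  %s" % s
print "round-trip: %r" % float(y)
print "bit-exact:  %s" % (x == y)       # False: the round trip changed the value
```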
With the new Cap'n Proto based serialization, this issue should go away, but it would be good to verify.
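For whoever picks that up, something along these lines should do. This is only a sketch: MODEL_PARAMS, the "consumption" field, and the "anomalyScore" inference key are taken from the hotgym anomaly example and may need adjusting, the ModelFactory import path differs between nupic versions, and whether the checkpoint goes through the old or the new serialization depends on the build.

```python
import copy
import os
import shutil
import tempfile

# "modelfactory" in older nupic releases, "model_factory" in newer ones.
from nupic.frameworks.opf.modelfactory import ModelFactory

from model_params import MODEL_PARAMS  # hotgym anomaly model params


def run(records, checkpoint_every=None):
    """Feed records to a fresh model, optionally saving and reloading it
    every `checkpoint_every` records, and return the anomaly scores."""
    model = ModelFactory.create(copy.deepcopy(MODEL_PARAMS))
    model.enableInference({"predictedField": "consumption"})
    scores = []
    for i, record in enumerate(records, 1):
        result = model.run(record)
        scores.append(result.inferences["anomalyScore"])
        if checkpoint_every and i % checkpoint_every == 0:
            tmp = tempfile.mkdtemp()
            checkpoint_dir = os.path.join(tmp, "model_checkpoint")
            model.save(checkpoint_dir)
            model = ModelFactory.loadFromCheckpoint(checkpoint_dir)
            shutil.rmtree(tmp)
    return scores


# records = parsed hotgym rows, e.g. [{"timestamp": dt, "consumption": kw}, ...]
# baseline = run(records)
# checkpointed = run(records, checkpoint_every=100)
# assert baseline == checkpointed, "save/load changed the results"
```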