Using the version of the TM (with SegmentOverlaps - the one before @mrcslws’ last merge) we have identical outputs until record 45; so that is where my investigation begins today…
The testing method I used was to run the test file through the Python SpatialPooler and save its output. Then I read that saved SpatialPooler output back in and ran it through the Java TemporalMemory. This way I can guarantee identical inputs to both the Python and Java TemporalMemory implementations.
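To make the hand-off concrete, here is a minimal sketch of saving the sparse SP output and reading it back. The file layout (one line per record, comma-separated active column indices) and the function names are assumptions for illustration, not the actual format I used.

```python
def save_sparse_outputs(path, records):
    """Write one line per record: comma-separated active column indices.

    The layout is a hypothetical choice; any stable text format works as
    long as both the Python and Java sides parse it the same way.
    """
    with open(path, "w") as f:
        for active_columns in records:
            f.write(",".join(str(i) for i in sorted(active_columns)) + "\n")


def load_sparse_outputs(path):
    """Read the file back into a list of sorted index lists, one per record."""
    with open(path) as f:
        return [
            [int(tok) for tok in line.strip().split(",") if tok]
            for line in f
        ]
```

The Java side would then read the same file line by line and feed each index array to the TemporalMemory, so neither TM ever sees anything the other did not.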
I’ve included Gists (only the first 100 lines) of both the Python and Java output, which show:
- The Record #
- The raw input (timestamp, temperature)
- The GroupBy2 Column/Segment Generator output
- The calls to the UniversalRandom next() method. next() is called with the bound in parentheses, i.e.
next(bound) <-- Even the bounds of the RNG calls are the same!
- The TM input (Sparse output of SP)
- The TM prediction
- The TM active columns (derived from the active cells)
For all of these you will see EXACT output duplication. Even the line numbers and format of every single output are identical (thanks to the UniversalRandom RNG).
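For anyone wondering how the RNG calls end up in the output, the idea can be sketched roughly as below. This is NOT the UniversalRandom implementation; `LoggingRandom` is a hypothetical wrapper around Python's standard `random.Random`, shown only to illustrate logging each `next(bound)` call so two runs can be compared line by line.

```python
import random


class LoggingRandom:
    """Hypothetical wrapper (not UniversalRandom) that records every call."""

    def __init__(self, seed, log):
        self._rng = random.Random(seed)  # stand-in generator, an assumption
        self._log = log                  # list that collects the call trace

    def next(self, bound):
        # Record the call and its bound before drawing, so the trace shows
        # exactly which draws were requested and in what order.
        self._log.append("next(%d)" % bound)
        return self._rng.randrange(bound)
```

With the same seed and the same sequence of bounds, two traces should match exactly; an extra line in one of them points at an extra draw.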
So now I’m investigating Record #45 to see why the Java version makes an additional call to the RNG and diverges on that record.
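Locating the divergence comes down to finding the first line where the two logs differ. A minimal sketch, assuming both outputs are available as lists of lines (the function name is mine, for illustration only):

```python
from itertools import zip_longest


def first_divergence(lines_a, lines_b):
    """Return (line_no, a, b) for the first differing line, or None.

    Line numbers are 1-based. If one log is longer than the other, the
    missing side is reported as None, which is how an extra RNG call in
    only one implementation would show up.
    """
    for line_no, (a, b) in enumerate(zip_longest(lines_a, lines_b), start=1):
        if a != b:
            return line_no, a, b
    return None
```

In practice the first differing line is often an inserted `next(bound)` entry on one side, after which everything below it is shifted.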
Right now I’m confident the Java NAB “poor results issue” has been fixed, but I just want to take it all the way and get identical output. Also, I need to run things through the Java SpatialPooler and ensure good results. Exact output using the Java SpatialPooler will have to come later because there are one or two “exotic” RNG methods being used by the Python SP. (I already equalized the “sample()” method to be the same in the UniversalRandom, but there may be another method I need to “normalize”).
It shouldn’t take me any more than a day or two to reach my goal of identical TM output, after which I’m not going to play with it anymore (after porting @mrcslws ’ latest merge) - and I’ll run the NAB to verify good results (I already see from the quality of the anomaly scores that the results will be in line with NuPIC, but like I said, I want to find the reason for the slight divergence.)