Help debugging HTM.java for anomaly detection

@alavin

Thank you for keeping up on things! Well, I’m finishing up a test harness that can run any combination of algorithms in a chain (matching Python and Java versions). The current version confirms that, after 50 or so calls to the UniversalRandom during TM initialization (using the most recent re-written TM, not the final one) and the processing of one row, both the Python and Java versions produce the same output and the same 50 or so random integers.

I’m waiting on https://github.com/numenta/nupic/pull/3254 to be merged, and then I’m going to re-write the Java version to do the same thing. After that, I’ll begin testing everything (very, very minutely) in earnest, and I expect to find whatever divergence exists (if any does after the TM re-write). I’ve also created another branch to update the Java SP to match the Python SP code, but that will be done after making sure the Java version gets good anomaly results; then I’ll create a new PR for the new SP.

@cogmission, https://github.com/numenta/nupic/pull/3254 was just merged.

@amalta Good Job!
I know! I’m so excited! Ok, well I’ve got to update my fork and get to work! :wink:

@rhyolight
I update my fork, then run:
python setup.py develop --user
right?

Yeah that should work. Not sure if you care, but if you want to see the compatibility test pass, you would have to update your nupic.core fork then compile/install the bindings.

Thanks, but I think I’ll wait to update nupic.core; it required major surgery last time, and I’m sure I’ll need @rhyolight’s assistance for that…

oh god please no

@alavin
Status Update:

After studying the itertools.groupby() and the new groupby2() code and dissecting their output at various stages, I’ve created a Java equivalent: a GroupBy class that is used by a GroupBy2 class and has the same algorithmic treatment and output. This is confirmed by two new tests, GroupByTest and GroupBy2Test, which perform the same tests as their Python counterparts and yield result data containers with exactly the same output. The two new classes can be used in the same way at the same call site within the new TemporalMemory (not yet written).
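
To give a feel for it, here’s a minimal sketch of the grouping idea - emit one (key, group) pair per run of consecutive elements that map to the same key, just like itertools.groupby(). The names are illustrative; this is not the actual HTM.java GroupBy code:

    import java.util.AbstractMap;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Iterator;
    import java.util.List;
    import java.util.Map;
    import java.util.function.Function;

    // Minimal sketch of an itertools.groupby()-style iterator in Java.
    // Illustrative only - not the actual HTM.java GroupBy implementation.
    public class GroupBySketch<T, K> implements Iterator<Map.Entry<K, List<T>>> {
        private final Iterator<T> source;
        private final Function<T, K> keyFn;
        private T pending;           // next element not yet assigned to a group
        private boolean hasPending;

        public GroupBySketch(Iterable<T> items, Function<T, K> keyFn) {
            this.source = items.iterator();
            this.keyFn = keyFn;
            advance();
        }

        private void advance() {
            hasPending = source.hasNext();
            pending = hasPending ? source.next() : null;
        }

        @Override public boolean hasNext() { return hasPending; }

        @Override public Map.Entry<K, List<T>> next() {
            K key = keyFn.apply(pending);
            List<T> group = new ArrayList<>();
            // Collect the run of consecutive elements sharing this key.
            while (hasPending && key.equals(keyFn.apply(pending))) {
                group.add(pending);
                advance();
            }
            return new AbstractMap.SimpleEntry<>(key, group);
        }

        public static void main(String[] args) {
            GroupBySketch<Integer, Integer> g =
                new GroupBySketch<>(Arrays.asList(7, 12, 16), x -> x / 10);
            while (g.hasNext()) System.out.println(g.next()); // 0=[7], then 1=[12, 16]
        }
    }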

Today I’m starting in on the new TemporalMemory and TemporalMemoryTest; I expect to be done around Monday. After that, I expect to return to debugging the NAB.

@mrcslws @amalta
Just an FYI: I altered the exit condition of GroupBy2 to be triggered by a custom PeekIterator that can query the Generator to see whether there’s a match for the next key, instead of letting the code throw an exception (a controlled crash) to indicate generator exhaustion. I’m not sure how the algorithm handles this in C++, but something similar might be possible in Python? I’m not sure… What I do is (see the sketch after this list):

  • Upon creation of the iterator, the first entry is processed and saved to a variable nextVar.
  • On each call to the iterator’s next(), the iterator copies nextVar to currVar.
  • The iterator then processes the n + 1 entry, saves it to nextVar, and returns the contents of currVar.
  • A new method on the Generator, peek(), returns the contents stored in nextVar - the data that would be returned by a subsequent call to next() at t + 1.
  • nextVar is set to “null” when there are no more entries to process, and the code can check it to see whether the generator is exhausted.
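
Here’s a minimal sketch of that peek-ahead idea, wrapping an ordinary Iterator (names illustrative, not the actual HTM.java PeekIterator; it assumes the underlying sequence contains no null elements, since null doubles as the exhaustion marker):

    import java.util.Iterator;
    import java.util.NoSuchElementException;

    // Sketch of the peek-ahead iterator described above. It stays one element
    // ahead of the caller so exhaustion can be detected with a null check
    // instead of a thrown exception.
    public class PeekIteratorSketch<T> implements Iterator<T> {
        private final Iterator<T> delegate;
        private T nextVar;  // the element the next call to next() will return

        public PeekIteratorSketch(Iterator<T> delegate) {
            this.delegate = delegate;
            // Process the first entry up front and save it to nextVar.
            nextVar = delegate.hasNext() ? delegate.next() : null;
        }

        /** Returns what a subsequent next() would return, or null if exhausted. */
        public T peek() {
            return nextVar;
        }

        @Override
        public boolean hasNext() {
            return nextVar != null;
        }

        @Override
        public T next() {
            if (nextVar == null) throw new NoSuchElementException();
            T currVar = nextVar;                                   // copy nextVar to currVar
            nextVar = delegate.hasNext() ? delegate.next() : null; // process the n + 1 entry
            return currVar;                                        // return the saved element
        }
    }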

@alavin

Finished re-write of TemporalMemory & test for same! I have some integration tests (the MonitorMixin stuff) to fix, and then I’m back to NAB debugging…

Awesome!

@alavin

Status Report:

Using the version of the TM with SegmentOverlaps (the one before @mrcslws’ last merge), we have identical outputs until record 45, so that is where my investigation begins today…

The testing method I used was to run the test file through the Python SpatialPooler and save its output. Then I read that SpatialPooler output back in and run it through the Java TemporalMemory. This way I can guarantee identical inputs to both the Python and Java TemporalMemory implementations.
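
In outline, the replay step looks something like the sketch below. The file format (one line per record, comma-separated active column indices) and the compute() signature are assumptions for illustration only, not the actual harness code:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.Arrays;

    // Sketch of the record-and-replay idea: feed the Java TM the exact sparse
    // output recorded from the Python SpatialPooler, so both TMs see the same
    // input stream. The Tm interface below is hypothetical.
    public class ReplaySpOutput {
        interface Tm { int[] compute(int[] activeColumns, boolean learn); }

        public static void replay(String path, Tm tm) throws IOException {
            try (BufferedReader in = new BufferedReader(new FileReader(path))) {
                String line;
                int record = 0;
                while ((line = in.readLine()) != null) {
                    line = line.trim();
                    if (line.isEmpty()) continue;
                    // Parse the sparse SP output recorded from the Python run.
                    int[] activeColumns = Arrays.stream(line.split(","))
                                                .mapToInt(Integer::parseInt)
                                                .toArray();
                    int[] activeCells = tm.compute(activeColumns, true);
                    System.out.println("Record " + (record++) + ": "
                            + Arrays.toString(activeCells));
                }
            }
        }
    }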

I’ve included gists (only the first 100 lines) of both the Python and Java output, which show:

  1. The Record #
  2. The raw input (timestamp, temperature)
  3. The GroupBy2 Column/Segment Generator output
  4. The calls to the UniversalRandom next() method. next() is logged with its bound in parentheses, i.e. next(bound) <-- even the bounds of the RNG calls are the same!
  5. The TM input (Sparse output of SP)
  6. The TM prediction
  7. The TM active columns (derived from the active cells)

For all of these you will see EXACT output duplication. Even the line numbers and the format of every single output are identical (thanks to the UniversalRandom RNG).
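
The trick that makes this call-for-call comparison possible is a generator built from pure integer arithmetic, so the exact same constants and bit operations can be written in both Java and Python. Here’s a toy sketch of the idea (not the actual UniversalRandom implementation):

    // Sketch of a cross-language RNG: a 48-bit linear congruential generator
    // (the constants below happen to be java.util.Random's). Because only
    // integer arithmetic is involved, the identical update rule can be written
    // in Python, yielding the same stream for the same seed and bounds.
    public class PortableRandomSketch {
        private long seed;

        public PortableRandomSketch(long seed) {
            this.seed = (seed ^ 0x5DEECE66DL) & ((1L << 48) - 1);
        }

        /** Returns a pseudo-random int in [0, bound). */
        public int next(int bound) {
            // Advance the 48-bit LCG state.
            seed = (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) - 1);
            int bits = (int) (seed >>> 17);  // take the top 31 bits
            return bits % bound;             // simple (slightly biased) reduction
        }
    }

The same update rule, written in Python with the state masked to 48 bits, produces an identical stream of values for identical seeds and bounds.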

So now I’m investigating Record #45 to see why the Java version makes an additional call to the RNG and diverges on that record.

Right now I’m confident the Java NAB “poor results issue” has been fixed, but I just want to take it all the way and get identical output. Also, I need to run things through the Java SpatialPooler and ensure good results. Exact output using the Java SpatialPooler will have to come later because there are one or two “exotic” RNG methods being used by the Python SP. (I already equalized the “sample()” method to be the same in the UniversalRandom, but there may be another method I need to “normalize”).
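
For the curious, one way to make a sample() reproducible across languages is to drive a partial Fisher-Yates shuffle entirely off the shared next(bound) calls. A sketch of that idea (illustrative only, not the actual UniversalRandom sample() code):

    import java.util.Arrays;
    import java.util.function.IntUnaryOperator;

    // Sketch of a reproducible sample(): choose k items without replacement,
    // driven by a shared next(bound) generator so that both language
    // implementations consume identical RNG calls.
    public class SampleSketch {
        /** Partial Fisher-Yates shuffle: the first k slots hold the sample. */
        static int[] sample(int[] population, int k, IntUnaryOperator next) {
            int[] pool = population.clone();
            for (int i = 0; i < k; i++) {
                // next.applyAsInt(bound) returns a value in [0, bound)
                int j = i + next.applyAsInt(pool.length - i);
                int tmp = pool[i]; pool[i] = pool[j]; pool[j] = tmp;
            }
            return Arrays.copyOf(pool, k);
        }

        public static void main(String[] args) {
            java.util.Random r = new java.util.Random(42);
            int[] s = sample(new int[]{10, 20, 30, 40, 50}, 3, r::nextInt);
            System.out.println(Arrays.toString(s));
        }
    }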

It shouldn’t take me more than a day or two to reach my goal of identical TM output, after which I’m not going to play with it anymore (after porting @mrcslws’ latest merge) - and I’ll run the NAB to verify good results. (I can already see from the quality of the anomaly scores that the results will be in line with NuPIC’s, but like I said, I want to find the reason for the slight divergence.)

Of what data? Is this consistent with multiple datasets?

I’m looking forward to this! I’d like to see your htm.java pr/branch when it’s ready to run on NAB.

@alavin

NAB/data/artificialNoAnomaly/art_daily_no_noise.csv

Anyway. I DID IT! EXACT OUTPUT FOR 4031 RECORDS! Every - Single - Thing - Is - Exact!

(… with regard to the TM and Anomaly Scores, using recorded SP input)

:stuck_out_tongue:

I’m so happy I think I just wet myself a little bit!

EDIT:
@alavin By “exact” I mean I’m using my test harness (assembled using the work you and @rhyolight did for QuickTest.py), which chains the raw algorithms together, piping input from one to the other. As input, I’m using the art_daily_no_noise.csv data set from NAB, and I output the results produced at each stage of each algorithm’s processing to ensure identical function between the Python and Java versions. I hope that makes sense?

The use of more than one dataset is, in this case, beside the point, because my goal was to produce a Java TM with identical output (we’ll get to the NAB later :wink: ) - I even normalized the floating point differences expressed in the adaptSegment() method - so the same output would be produced regardless of the input dataset file used.
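
The idea behind that kind of normalization is to quantize values to a fixed precision after each update, so both languages snap to the same representable numbers. An illustrative sketch of that rounding trick (not the actual adaptSegment() code):

    // Sketch of one way to neutralize tiny cross-language floating point
    // drift: round every permanence to a fixed precision after each update.
    public class PermanenceRounding {
        static final double SCALE = 100000.0; // keep 5 decimal places

        /** Applies an increment/decrement, clamps to [0, 1], then quantizes. */
        static double adapt(double permanence, double delta) {
            double updated = Math.max(0.0, Math.min(1.0, permanence + delta));
            return Math.round(updated * SCALE) / SCALE;
        }

        public static void main(String[] args) {
            // 0.1 + 0.2 is not exactly 0.3 in binary floating point ...
            System.out.println(0.1 + 0.2);        // 0.30000000000000004
            // ... but the quantized update is.
            System.out.println(adapt(0.1, 0.2));  // 0.3
        }
    }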

So right now my next step is to wait for @mrcslws’ new code to be merged, update the HTM.Java TM with those changes, and then run the NAB with the Java detector, which at that point should give us results on par with NuPIC.

Great job, David!

Awesome!!

Status Report:

Also, while waiting for the above merge for the TM update, I’ve started the SP re-write to bring it into sync with the changes made since it was first written. Again, now that I have the handy UniversalRandom, I’m shooting for exact duplication of results using an example NAB data set (running through the ScalarEncoder and SP only). As a side note, I’ve never had issues with encoders failing to produce exact results.

@alavin
Status Report:

SpatialPooler.java has been re-written along with its test. Using the UniversalRandom, the initialization (including all the overlap pools, etc.) and the first 42 lines of output (out of 4,031) are exactly the same as the Python version’s.

The SP’s complex initialization exposed a flaw in the UniversalRandom RNG (not a bug, because it didn’t cause an error or inappropriate behavior; it just caused a deviation when compared to the Python version) having to do with floating point overflows in the Java version. That is now fixed, and the two versions run and compare fine.

While waiting on the new TM updates, I’m now working on the deviation that starts at iteration 43 when running the NAB file through the new SpatialPooler. My progress on that can be tracked in this SP PR.

@alavin
Status Report:

Re: the SpatialPooler: identical initialization and then identical output for 42 lines is, I realize, the best I’m going to get. The SP does so many floating point calculations that cumulative differences are unavoidable. I’m grateful that I can prove out the initialization and the first several lines of output, though (that is a HUGE win); that and identical unit tests go a long way toward ensuring “compatible operation” between the two language implementations.
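
A toy demonstration of the effect (nothing to do with the actual SP arithmetic, just the general phenomenon of two mathematically equivalent computations drifting apart):

    // Two mathematically equivalent accumulations diverge in the low-order
    // bits once enough floating point roundings pile up.
    public class DriftDemo {
        public static void main(String[] args) {
            double a = 0.0, b = 0.0;
            for (int i = 0; i < 1_000_000; i++) {
                a += 0.1;              // accumulate in one step
                b += 0.05; b += 0.05;  // same value, different operation order
            }
            System.out.println(a);       // roughly 100000, e.g. 100000.00000133288
            System.out.println(b);       // roughly 100000, but not bit-identical
            System.out.println(a == b);  // false - the rounding histories differ
        }
    }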

The new SP code depends on the new TM branch; even though the latest PR was merged yesterday, it seems there is one more NuPIC PR that needs to be merged, one with some fundamental differences in how TM constructs are referenced internally. So I will take in both changes at the same time, updating HTM.Java’s TM (ensuring identical output), and then check in both the TM PR and the SP PR.

Following that, we will finally be able to run the NAB and see the effect of the changes (which, given the rigor of the rewrites, I expect will be a resounding success).

How did you determine that floating point precision is the root cause, and the only cause, of the discrepancies in SP states? If that’s true, the differences in SP states would be minimal.

And why 42? Are your findings of identical output for 42 lines consistent across multiple data files? If floating point precision is the cause, I would expect the 42 lines to fluctuate for different random initializations and for different data files.

I recommend sticking with your current TemporalMemory until this anomaly detection debugging is finished. If the HTM.java TemporalMemory gets the same results as last week’s nupic TemporalMemory, then you’re good. You can just use that snapshot of nupic.

So you should be able to run NAB now. No need to keep porting the latest PRs.

In other words, Marcus thinks he’s not blocking this investigation by continuing to change code. :slight_smile:

Correct assumption. We talked about this a little at today’s Hackers’ Hangout. In the future, @cogmission will link HTM.Java releases to explicit NuPIC release versions so he won’t always have a moving target.
