Help debugging HTM.java for anomaly detection

@alavin,

In HTM.Java’s real code, and in the NetworkConsistencyTest just written because of this issue, I store the previous predictions and compare them to the current active columns.

This QuickTest.py file was written a year ago before the time I had the “epiphany” about what I was doing wrong in the Network API with the inputs and output.

Sorry, I overlooked this. However, the **current / real** code doesn’t have this oversight. I was hoping it did because that would account for the problems.

Even though I calculate the Anomaly after the call to TM.compute(), it is calculated on the previousPredictiveCells (converted to columns, of course), which are stored before the current call to TM.compute().
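
A minimal sketch of the "converted to columns" step, in Python for brevity. It assumes cells are numbered contiguously within each column, so that a cell's column is its index divided by the number of cells per column; `cells_to_columns` and `cells_per_column` are hypothetical names, not HTM.java API.

```python
def cells_to_columns(cells, cells_per_column):
    """Map cell indices to the sorted, de-duplicated indices of their columns.

    Assumption: cell i belongs to column i // cells_per_column.
    """
    return sorted({cell // cells_per_column for cell in cells})


# Predictive cells saved *before* the current TM compute call.
previous_predictive_cells = [0, 5, 31, 32, 64]
previous_predicted_columns = cells_to_columns(previous_predictive_cells, 32)
print(previous_predicted_columns)
```

The point is only the ordering: the cell set is captured before TM.compute() runs for the current input, so the columns derived from it represent the prediction made at t - 1.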

I will make the changes to QuickTest.py and QuickTest.Java (if they’re there in that file) and then you can give it a “look-over” to see if they’re correct.

Once again, HTM.Java stores the previous predictions and uses those in its Anomaly calculation; here it is.

@alavin

The thinking is to compare this code with the code in the NAB detectors. Doesn’t the code in the “anomaly compute” do the same thing?

@rhyolight Sorry I didn’t look at this right away, I stepped away for most of the day. Thank you very much, I can definitely work with your changes! I will integrate that along with the changes @alavin suggested and we’ll have an actual raw test layer to compare the entire processing chain.

Thanks again!

Taking a brief “respite” in the interest of having “identical” behavior, and re-writing the TemporalMemory to be in sync with the new “columnSegmentWalk” approach recently merged into NuPIC…

After that, I will continue with the comparison/analysis…

@rhyolight @lscheinkman @mrcslws @alavin @fergalbyrne

Here’s my implementation of a Python Generator in Java: (Note: Work in progress :stuck_out_tongue: )

Benefits:

  • Able to define a processing loop independently of the iteration mechanism
  • yield() can be called from anywhere in the processing loop (not necessarily just at the end) (like in Python)
  • Can be used as an Iterator or as a forEach Iterable
  • next() blocks until a value is available in case next() is called before the call to yield()
  • Derivatives such as an Integer generator can be created with simple syntax (e.g. IntGenerator.of(lowerBound, upperBound))
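
For reference, a minimal Python sketch of the generator behavior these bullets aim to mirror. The names `int_generator` and `evens_then_done` are hypothetical, and treating the upper bound as exclusive is an assumption; the Java IntGenerator may define its bounds differently.

```python
def int_generator(lower, upper):
    """Yield integers from lower (inclusive) to upper (exclusive, assumed)."""
    i = lower
    while i < upper:
        yield i
        i += 1


def evens_then_done(values):
    """The processing loop is written independently of how it is iterated."""
    for v in values:
        if v % 2 == 0:
            yield v        # yield from the middle of the loop body
    yield "done"           # and again after the loop has finished


gen = int_generator(0, 3)
print(next(gen))                                   # explicit next()
print(list(evens_then_done(int_generator(0, 5))))  # consumed as an iterable
```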

I’m open to suggestions or critiques though?

…with this I can get near perfect semantic parity with the Python version of the excitedColumnsGenerator() method:

Compare Java version to Python version

:confetti_ball:

I haven’t looked at it closely, but one thing to note: part of the reason we used a generator in Python is because it feels very natural to use them in Python. “yield” is already part of the language, so we’re not introducing conceptual overhead by using it.

In C++ we implement this column/segment walk via iterators: https://github.com/mrcslws/nupic.core/blob/6c6ad5d45a133d4b48c4918df1919e267a949577/src/nupic/algorithms/TemporalMemory.cpp#L69

In other words, you don’t necessarily have to use a generator. It’s probably best to use whatever approach feels best in Java. And maybe adding a generator class is the right approach – I just wanted to point out that it’s not the only option.

Cool, I’ll take a look, thanks for the cpp reference - I’ll see what “feels” best - though I have to say that the Generator approach is very similar to Iterators and the new Streaming functionality of Java 8 - I’ll take a look, and thanks for the feedback, that’s exactly the kind of response I was hoping for!

Hi @cogmission… any updates?

Hi @alavin,

Not sure if you’ve been following everything, but I just wrote a universal random number generator, UniversalRandom (and tested it side-by-side in both languages, confirming identical output), and now I’m re-writing the Java TemporalMemory to bring it in line with the new Python TM (with the columnSegmentWalk treatment). I’ve been chronicling my progress here (I know this thread is very long, so you might not have wanted to read all of the content).
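
To illustrate why an integer-only generator can produce identical output in both languages, here is a minimal sketch. It uses a 32-bit xorshift purely as an illustrative assumption; it is not necessarily the algorithm UniversalRandom uses, and `PortableRandom` and its methods are hypothetical names.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 unsigned bits


class PortableRandom(object):
    """Illustrative xorshift32; only integer shifts and XORs, no floats."""

    def __init__(self, seed):
        # xorshift must not start from state 0, so fall back to 1.
        self.state = (seed & MASK32) or 1

    def next_uint32(self):
        x = self.state
        x ^= (x << 13) & MASK32
        x ^= x >> 17
        x ^= (x << 5) & MASK32
        self.state = x
        return x

    def next_int(self, bound):
        """Return a value in [0, bound); simple modulo, slightly biased."""
        return self.next_uint32() % bound


rng_a = PortableRandom(42)
rng_b = PortableRandom(42)
print([rng_a.next_int(100) for _ in range(5)] ==
      [rng_b.next_int(100) for _ in range(5)])  # same seed, same sequence
```

Because every step is a masked integer operation, a port to Java (using a long masked to 32 bits, or unsigned shifts on int) should be able to reproduce the same sequence, which is the property the side-by-side tests rely on.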

I am due to finish up the TM unit tests hopefully today, and using the new UR I will be able to get the exact same output in both languages. I then plan to use the work @rhyolight did to provide a barebones Layer in Python, and compare that to the Java version of the same thing to see if we get the same anomaly scores (we should).

If they are different, I will record the output of the SP and just use a direct TM -> Anomaly assembly pumping in the same SP SDRs and see if we then get the same output. We HAVE to at that point.

Anyway, I will move up and down the levels of the breakdown until I find where the “departure” is, possibly re-writing the SpatialPooler too (there have been updates since I wrote the current one in 2014/2015).
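
A minimal sketch of how the “departure” could be located once both implementations have written their per-record outputs to lists. `first_divergence` is a hypothetical helper, not part of either code base, and the sample scores are made up for illustration.

```python
def first_divergence(expected, actual, tol=0.0):
    """Return the first index where the sequences differ by more than tol.

    Returns None when the sequences match; if one is a strict prefix of the
    other, returns the length of the shorter one.
    """
    for i, (e, a) in enumerate(zip(expected, actual)):
        if abs(e - a) > tol:
            return i
    if len(expected) != len(actual):
        return min(len(expected), len(actual))
    return None


python_scores = [0.0, 0.5, 1.0, 0.25]   # illustrative values only
java_scores = [0.0, 0.5, 0.9, 0.25]     # illustrative values only
print(first_divergence(python_scores, java_scores))
```

The same check can be applied at each stage (encoder output, SP columns as index lists, TM predictions, raw scores) to narrow down which component first disagrees.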

So I’m doing preparatory work to break things down - with the goal of getting identical output.

Status Update:

Using the new UniversalRandom RNG, I have finished the re-write of the TM and have completed 13 tests (so far), all with exactly the same output (same cell, segment and synapse indexes chosen from burst etc.)…

I think I’m about half way through the tests…

I’ve read every word :wink:. This project is important to me.

Keep in mind when comparing the anomaly scores you don’t want to use the anomaly likelihood calculation. In NAB this corresponds to the “raw scores”.

Thank you for the update, keep it going!

No worries. I know this. It’s just that there’s so much here, and I think I misspoke because some of the other stuff I mentioned is actually in other threads, I believe, and not here. I really appreciate your support Alex!

Cheers

@alavin

TemporalMemory & Test re-write complete! :confetti_ball:

Next step: Assemble @rhyolight’s handiwork with yours and my own to make a rudimentary layer in both Python and Java and get back to testing NAB! (probably starting tomorrow because I’ve already been working 13 hours [since 1:00am])… Just a heads-up! (I’ll check the code in for merging tomorrow; I have to remove the old stuff and make sure the TemporalMemoryMonitorMixin stuff still works)

@alavin @rhyolight @fergalbyrne @lscheinkman @mrcslws

Status:

Upon removal of the “old” TemporalMemory, I have more than a few tests to examine; either for fixing/adjustment, removal or rewrite - depending on the nature of the failure. This is to say that the anticipated step of swapping the old for the new is a bit more involved than previously expected.

My current goal is for the new code to be error free by Monday, and to start back in on preparing the skeleton code to be able to return to NAB testing then.

Cheers,
David

A post was split to a new topic: Potential bug in spatial pooler

@alavin

Hi, I just want to confirm exactly what is supposed to go into the Anomaly.computeRawAnomalyScore() method. I can’t believe I’m asking this again, but this is all so confusing… :stuck_out_tongue:

I’m pretty sure Subutai said that you take:

  1. The output of the SpatialPooler, which represents the “activeColumns” coming in to the TemporalMemory.
  2. The previously predictive columns (as derived from the predictive Cells in t - 1)
  3. So the comparison is between the predicted input of the TM in t - 1 and the current input coming from the SP?

But now I’m not sure if you take the currently activated columns from the TM and not the TM’s input?

Also above you probably meant “columns” but you said “cells” and I just want to make sure?

So I just need to clear this up?

Sorry for the delay @cogmission, on PTO…

To reiterate this:

	for each timestep t: 
		1. run data through encoder
		2. run SP computation
		3. get values for raw anomaly score calculation:
			a = representation of input at time t (i.e. SP currently active columns)
			b = representation of prediction at time t-1 (i.e. columns of TM's currently predicted cells)
		4. compute raw anomaly score:
			rawScore = 1 - |a ∩ b| / |a|   (fraction of active columns that were not predicted)
		5. run TM computation
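
Step 4 above can be sketched in Python as follows. The function name `raw_anomaly_score` is hypothetical, and returning 0.0 when there are no active columns is an assumption made here so the function is total; check the behavior of the implementation you are comparing against.

```python
def raw_anomaly_score(active_columns, prev_predicted_columns):
    """Fraction of currently active columns that were not predicted at t-1."""
    active = set(active_columns)
    if not active:
        return 0.0  # assumption: no input columns means nothing to miss
    predicted_and_active = active & set(prev_predicted_columns)
    return 1.0 - float(len(predicted_and_active)) / len(active)


# a = SP active columns at t, b = columns predicted by the TM at t-1
a = [1, 2, 3, 4]
b = [2, 3, 9]
print(raw_anomaly_score(a, b))  # 2 of the 4 active columns were predicted
```

Note that a predicted column that does not become active (column 9 here) does not affect the score; only the active columns are counted.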

@alavin,

Sorry for contacting you on your PTO (Matt also informed me that you were away until the end of the week)… …and thank you for responding!

That clears that up. I just wanted to make sure.

Hi David, any updates on this?