Help debugging for anomaly detection


Also this is precisely what I am testing. The Observer versus no Observer. Can you please point to where the NAB is getting its data if you know otherwise? Please? Pretty please with sugar on top? :stuck_out_tongue: Instead of just stating this?


Also, Donald Trump and the presumed intelligence of the American People, is really getting me down. Am I working day in day out to improve the future living conditions of the audience of Jerry Springer? I think I’m actually going to move to Barcelona!


I’m not sure what you mean. The NuPIC build hasn’t been broken in a long time. If you’re having problems installing NuPIC, please post a message on the #nupic forum describing your installation method and the errors you’re getting.


Maybe Travis was experiencing hiccups - I just glanced over at NuPIC while I was building HTM.Java and it was red…


I seem to have been reading some other version of the NAB code, which ran the model synchronously and just used Anomaly.computeRawAnomalyScore based on using data from the ManualInput. Apols.

Just checked the code, and indeed the NAB code is just going through the Network. But the Network just delegates to the Layer, which uses its anomalyFunc (around line 2360), which uses methods and members of the ManualInput to get the anomaly score.

The Anomaly the NAB code uses is obtained via Anomaly.create(), which by default uses KEY_MODE Mode.PURE, and the Anomaly is created with useMovingAverage = false, so the Anomaly used in Layer’s anomalyFunc just calls and returns computeRawAnomalyScore() on the data it gets from the ManualInput.

So, despite the many layers of indirection, the NAB code is indeed using computeRawAnomalyScore() on the data from an Observer.
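
For reference, the raw score being discussed can be sketched in a few lines of Python. This is a minimal sketch and not the HTM.Java or NuPIC source: the function name `raw_anomaly_score` is hypothetical, and returning 0.0 when no columns are active is an assumption made here for illustration.

```python
def raw_anomaly_score(active_columns, prev_predicted_columns):
    """Fraction of currently active columns that were NOT predicted at the previous step."""
    active = set(active_columns)
    if not active:
        # Assumption for this sketch: no active columns means nothing to score.
        return 0.0
    unpredicted = active - set(prev_predicted_columns)
    return len(unpredicted) / float(len(active))


if __name__ == "__main__":
    # Two of the four active columns (1 and 4) were not predicted.
    print(raw_anomaly_score([1, 2, 3, 4], [2, 3, 9]))
```

Because the function only looks at its two inputs, a wrong score from correct code has to come from one of those two lists.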


I’m not quite sure what your conclusion is, but I agree the score is coming from exactly where you determined. Are we supposed to be using a moving average? Is this not what is expected?


This just shows that the wrong anomaly score is being computed using the correct code. If the wrong output is being produced by correct code then the source of the problem is the inputs to that code. That’s why you need to check the following:

  1. the activeColumns sparse int[] and the previousPredictedCells (as a sparse int[] of column indices) from the TM with no Observer attached to the Layer or Network (i.e., by running the Layer manually, input by input).
  2. the mi.getFeedForwardSparseActives and mi.getPreviousPredictiveCells (also as a sparse int[] of column indices) from the ManualInput attached to the Network/Layer.
  3. Compare.
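
The comparison in step 3 could look something like the sketch below. It is illustrative only: `cells_to_columns` and `diff_columns` are hypothetical helpers, not part of the HTM.Java API, and the `cells_per_column` value of 6 is an assumption that merely happens to be consistent with the cell and column indices printed in the test output in this thread.

```python
def cells_to_columns(cell_indices, cells_per_column):
    """Map sparse cell indices to a sorted list of unique column indices."""
    return sorted({cell // cells_per_column for cell in cell_indices})


def diff_columns(unobserved, observed):
    """Return (only_in_unobserved, only_in_observed) as sorted lists."""
    a, b = set(unobserved), set(observed)
    return sorted(a - b), sorted(b - a)


if __name__ == "__main__":
    # Hypothetical values: predictive cells read straight from the TM,
    # versus column indices reported by the ManualInput.
    from_tm = cells_to_columns([6, 7, 20, 49], cells_per_column=6)
    from_manual_input = [1, 3, 8, 17]
    print(from_tm)
    print(diff_columns(from_tm, from_manual_input))
```

Two empty lists from `diff_columns` mean the two sources agree for that record; anything else pinpoints which columns differ.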


By the way, if anyone wants me to set up a hangout to discuss this issue, let me know and I’ll coordinate a time between all interested parties that works, and host the hangout.


@rhyolight that might be the best thing to do. I’m not free for that until early next week…

Meanwhile, I just finished getting this test working. It’s called LayerTest.testTMAnomalyInteraction(). It sets up two identical Layers and adds an Observer to one of them, which does the anomaly computation using the ManualInput’s data and methods; the other Layer is unobserved. They both get the same data (which is just a 7-element sequence repeated 600 times), and then they both get an input that skips one record. Here’s the output:

seq: 4209  --> 2  output = [21, 75, 14, 17, 23, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113], 
				 cols = [3, 12, 2, 8, 9, 17, 18]
recordNum: 4209
l.ffActiveColumns:	[]
o.ffActiveColumns:	[2, 3, 8, 9, 12, 17, 18]
l.prevPredColumns:	[2, 3, 11, 12, 13, 14]
o.prevPredColumns:	[2, 3, 11, 12, 13, 14]
anomalyScore:		0.5714285714285714
l2.recordNum: 4209
l2.ffActiveColumns:	[2, 3, 8, 9, 12, 17, 18]
l2.prevPredColumns:	[2, 3, 11, 12, 13, 14]
anomalyScore:		0.5714285714285714
seq: 4210  --> 3  output = [76, 49, 20, 74, 107, 102, 52, 103, 22, 112, 16, 53, 72], 
				 cols = [12, 8, 3, 17, 18, 2]
recordNum: 4210
l.ffActiveColumns:	[]
o.ffActiveColumns:	[2, 3, 8, 12, 17, 18]
l.prevPredColumns:	[1, 2, 3, 5, 7, 8, 9, 11, 12, 16, 17, 18, 19]
o.prevPredColumns:	[1, 2, 3, 5, 7, 8, 9, 11, 12, 16, 17, 18, 19]
anomalyScore:		0.0
l2.recordNum: 4210
l2.ffActiveColumns:	[2, 3, 8, 12, 17, 18]
l2.prevPredColumns:	[1, 2, 3, 5, 7, 8, 9, 11, 12, 16, 17, 18, 19]
anomalyScore:		0.0
seq: 4211  --> 4  output = [103, 48, 102, 12, 13, 14, 15, 16, 17, 42, 43, 44, 45, 46, 47, 54, 55, 56, 57, 58, 59, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119], 
				 cols = [17, 8, 2, 7, 9, 18, 19]
recordNum: 4211
l.ffActiveColumns:	[]
o.ffActiveColumns:	[2, 7, 8, 9, 17, 18, 19]
l.prevPredColumns:	[1, 3, 8, 17]
o.prevPredColumns:	[1, 3, 8, 17]
anomalyScore:		0.7142857142857143
l2.recordNum: 4211
l2.ffActiveColumns:	[2, 7, 8, 9, 17, 18, 19]
l2.prevPredColumns:	[1, 2, 7, 8, 9, 17, 18, 19]
anomalyScore:		0.0

[edit: this is running of a possibly old version of, so might be showing bugs which have been fixed - @cogmission please advise]


@fergalbyrne @cogmission @alavin @lscheinkman Check your emails… :slight_smile:



Discrepancies between the Layer and the Observable are not important. The Layer’s accessors (for output data, that is) are returning data that is constantly in a race. The only data that is really pertinent is the data in the Inference returned by onNext(). I may just remove those Layer accessors to make that point more clear?

Does that clear up your test? Otherwise I’m not exactly sure what you are testing.


@rhyolight @fergalbyrne

I appreciate the conversation effort, but it’s still too early for this. I’m still doing my analysis and I need to get @alavin’s update to that file before I can continue my analysis. After which, I’m quite confident I’ll find the issue. If I don’t at that point then maybe a confab will be necessary…


I totally disagree, David. This is us trying to help. There is still a lot of confusion about where the problem lies, and a conversation is the best way to clear the communication channels. Please accept and pick a time? Worst that could happen is 30 minutes wasted, and I am certain this won’t be wasted time for you.


I’m running two identical Layers on identical inputs. One of them (l) is observed, the other (l2) isn’t. By the time this code gets run, both TMs have seen the same 7-input sequence 600 times, so they predict it perfectly. When I introduce an anomaly (#4209) by skipping an input, both produce the same TM outputs and the same anomalyScore. But for record #4211, the observed TM is producing a different prevPredictedColumns() output than the naked Layer’s TM. This record is generated at timestep #4210 (i.e., just after the anomaly). So an observed TM is behaving differently to an unobserved one.
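
The divergence described here can be checked mechanically from the logged values. The sketch below is only an illustration: the column lists are copied from the log output earlier in this thread, while the trace layout and the helpers `first_divergence` and `unpredicted_fraction` are hypothetical and not part of HTM.Java.

```python
# Observed Layer: active columns come from the o.ffActiveColumns lines,
# because l.ffActiveColumns is printed as [] in the log.
observed = {
    4209: {"active": [2, 3, 8, 9, 12, 17, 18],
           "prev_pred": [2, 3, 11, 12, 13, 14]},
    4210: {"active": [2, 3, 8, 12, 17, 18],
           "prev_pred": [1, 2, 3, 5, 7, 8, 9, 11, 12, 16, 17, 18, 19]},
    4211: {"active": [2, 7, 8, 9, 17, 18, 19],
           "prev_pred": [1, 3, 8, 17]},
}
# Unobserved Layer (l2 in the log).
unobserved = {
    4209: {"active": [2, 3, 8, 9, 12, 17, 18],
           "prev_pred": [2, 3, 11, 12, 13, 14]},
    4210: {"active": [2, 3, 8, 12, 17, 18],
           "prev_pred": [1, 2, 3, 5, 7, 8, 9, 11, 12, 16, 17, 18, 19]},
    4211: {"active": [2, 7, 8, 9, 17, 18, 19],
           "prev_pred": [1, 2, 7, 8, 9, 17, 18, 19]},
}


def first_divergence(a, b, key="prev_pred"):
    """Return the first record number where the traces disagree on `key`, else None."""
    for record_num in sorted(set(a) & set(b)):
        if a[record_num][key] != b[record_num][key]:
            return record_num
    return None


def unpredicted_fraction(record):
    """Share of active columns missing from the previous prediction."""
    active = set(record["active"])
    return len(active - set(record["prev_pred"])) / float(len(active))


if __name__ == "__main__":
    rec = first_divergence(observed, unobserved)
    print(rec, unpredicted_fraction(observed[rec]), unpredicted_fraction(unobserved[rec]))
```

On these values the active columns agree at every record, and only the previously predicted columns differ, which is what moves the score from 0.0 to 5/7 on the observed side.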



I see! Can you please put those methods back into SDR so I can merge your code and then run this test and have a look?

This would be very helpful if I can confirm what you’ve found! :slight_smile:

Please move those methods or we’re going to sit here in an impasse because I’m not merging those methods… And you are doing some important work which we need to share?


I have to go AFK until tomorrow, don’t have time to move those methods now - I’m late already to meet someone flown in from Switzerland. Try copying the test (look in my fork).



Ok will do… Thanks!


I just fetched the detector and ran it (with the most recent nupic):

Final score for 'htmjava' detector on 'reward_low_FP_rate' profile = 3.46
Final score for 'htmjava' detector on 'reward_low_FN_rate' profile = 13.00
Final score for 'htmjava' detector on 'standard' profile = 9.59


I’m using the Network API on purpose here because that’s how NAB interfaces with NuPIC, and how users should in general. Within your example I’ve shown you the data inputs and outputs to each region, just as you specified. I’ve also made it clear how to access this and other information from the network regions – i.e., via self.sensor, self.sp, and All of this was fit into the framework you provided, so it should be very straightforward for you to use this script in parallel with; I had assumed this was your intent :wink:.

As @fergalbyrne pointed out, the anomaly score calculation is simply the following edit:

    # ... temporal memory
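    # Assumption: self.tm is the handle to the temporal memory region
    activeCells = self.tm.getOutputData("bottomUpOut")
    predictedActiveCells = self.tm.getOutputData("predictedActiveCells")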
    print "TemporalMemory Output (active cells) = "
    print "\t", activeCells.nonzero()[0]
    print "TemporalMemory correct predictions (active cells that were previously predicted) = "
    print "\t", predictedActiveCells.nonzero()[0]

    print "Raw anomaly score =", 1.0 - predictedActiveCells.sum() / activeCells.sum()


Hi Alex,

Thank you for getting back to me.

Regarding the above, that is not my concern at all (I’m not concerned with what users should do as much as I need to see how to build the Network). Think of me as someone who has to build a framework from your example, not just another user. I need to deeply debug and confirm everything. This code does not show me what is going into the Anomaly code at all. This is not what I need, because I need as few impediments to debugging as possible (no indirection) and I don’t want to debug into Network code. I can’t use a “black box” that I can get things from; I need to see how everything is created… I need to see how the data is prepared for entry into the Anomaly code (i.e. where and what is being taken from the TM and how it is being transformed to go into the Anomaly code).

I can’t use complex code for debugging this. I need everything in front of my face…

EDIT: Also, what field in the TM does this “predictedActiveCells” come from? See, these are precisely the questions I don’t want to have to ask… If I have to ask what or where something is or comes from then the code isn’t basic enough… :wink:

EDIT: Also, I need to see all the pieces interact in a simple example. Then I can build the same simple example in Java and see that it works. Then from there I can confirm all the features and framework I have built in Java - around this basic understanding. But right now, I’m not sure that I’m doing anything beyond the encoder, SP, TM algorithms - right… What will make me confident is seeing every little tiny step.

I really do apologize for the confusion and if this is kind of a pain to acquiesce to… But I kind of have a difficult debugging session in my future and I need to do it in as straightforward a manner as possible. :wink: (which means not bumping into obscure SWIG code in the Network interfaces etc…)