Help debugging for anomaly detection


This exception is caused by the incorrect path to the configuration node:

JsonNode modelParams = params.path("modelConfig").path("modelParams");
Exception in thread "main" java.lang.IllegalArgumentException: Cannot initialize this Sensor's MultiEncoder with a null settings
	at nab.detectors.htmjava.HTMModel.<init>(
	at nab.detectors.htmjava.HTMModel.main(

The next exception I got was caused by a missing resolution in the config JSON file:

Exception in thread "main" java.lang.IllegalStateException: Resolution must be a positive number
	at org.numenta.nupic.encoders.RandomDistributedScalarEncoder.init(
	at org.numenta.nupic.encoders.RandomDistributedScalarEncoder$
	at org.numenta.nupic.encoders.RandomDistributedScalarEncoder$
	at org.numenta.nupic.encoders.MultiEncoderAssembler.assemble(
	at nab.detectors.htmjava.HTMModel.<init>(
	at nab.detectors.htmjava.HTMModel.main(

This last exception was caused by not passing “-s 1” (the skip configuration of 1):

Exception in thread "Sensor Layer [NAB Region:NAB Layer] Thread" java.lang.IllegalStateException: java.lang.IllegalArgumentException: Invalid format: "timestamp"
	at org.numenta.nupic.encoders.MultiEncoder.encodeIntoArray(
	at org.numenta.nupic.encoders.Encoder.encode(
	at java.util.Spliterators$IteratorSpliterator.tryAdvance(
	at java.util.Spliterators$1Adapter.hasNext(
	at java.util.Iterator.forEachRemaining(
	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(
Caused by: java.lang.IllegalArgumentException: Invalid format: "timestamp"
	at org.joda.time.format.DateTimeFormatter.parseDateTime(
	at org.numenta.nupic.encoders.DateEncoder.parse(
	at org.numenta.nupic.FieldMetaType.decodeType(
	at org.numenta.nupic.encoders.Encoder.getInputValue(
	at org.numenta.nupic.encoders.MultiEncoder.encodeIntoArray(
	... 20 more
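To illustrate why the skip setting matters, here is a minimal Python sketch (not the HTM.Java code path) of what happens when the CSV header row reaches a date parser. The timestamp format, the `parse_rows` helper, and the sample rows are all assumptions made up for this illustration.

```python
from datetime import datetime

# Assumed timestamp layout for this sketch; the real format is whatever
# the date encoder is configured with.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_rows(lines, skip=0):
    """Parse 'timestamp,value' lines, ignoring the first `skip` lines."""
    rows = []
    for line in lines[skip:]:
        stamp, value = line.strip().split(",")
        rows.append((datetime.strptime(stamp, TIMESTAMP_FORMAT), float(value)))
    return rows


# Hypothetical sample input with a single header row.
sample = [
    "timestamp,value",
    "2014-07-01 00:00:00,10.0",
    "2014-07-01 00:30:00,12.5",
]

# With one line skipped, only the data rows reach the parser.
rows = parse_rows(sample, skip=1)

# With nothing skipped, the literal string "timestamp" is handed to the
# date parser, which rejects it (the Python analogue of "Invalid format").
try:
    parse_rows(sample, skip=0)
    header_failed = False
except ValueError:
    header_failed = True
```

The failure comes from the header row, not from the data rows themselves.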

That’s it. That should be all you might encounter if you don’t have the fixes I mentioned in place. This is just to help while you configure the NAB settings to employ these remedies. Hopefully this helps, because I’m pretty sure you’ll have to change a file name and/or alter which file is pointed to, and then make sure the params are set up correctly…

Again, I’m wondering if “n” and “w” need to be put in that JSON file as well?


I changed the declaration location of the output and input variables back to where they were and removed the finally clause, because for some reason the local final pointer to System.out couldn’t be copied to another variable. Very weird, but I had stopped getting output to Standard Out! That’s one for the record books! Does not make sense…

So if you’ve copied the file from the gist before this post, please get it again with the latest changes.


The RDSE only needs the desired resolution, which it calculates via numBuckets in the config.

@cogmission @lscheinkman I’ll try running this before we chat later this morning.


@alavin I don’t think HTM.Java has a numBuckets parameter, so I may have to make allowances for this param in the RDSE’s Network initialization… That’s probably why there is an exception surrounding resolution.

Thank you!

EDIT: I’m going to add an issue for this…


You don’t need to add an issue for this, although it has been a point of confusion for me as well. See:

The RDSE wants a resolution, but in order to get a decent resolution, you need to know the min/max of the data. That’s all this calculation does. It cannot be “baked into” the RDSE without knowing about the data.
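As a rough sketch of that calculation: the resolution is the data range divided by the number of buckets, with a floor so it stays positive. The function name, the default of 130 buckets, and the 0.001 floor are assumptions for illustration, not values taken from this thread.

```python
def rdse_resolution(min_val, max_val, num_buckets=130.0, min_resolution=0.001):
    """Derive an RDSE resolution from the known range of the data.

    num_buckets and min_resolution are illustrative defaults (assumptions).
    """
    if max_val < min_val:
        raise ValueError("max_val must be >= min_val")
    # Spread the data range over the buckets, but never go below the floor,
    # so a constant series still yields a positive resolution.
    return max(min_resolution, (max_val - min_val) / num_buckets)


# A series ranging from 0 to 260 gives buckets that are 2 units wide.
wide = rdse_resolution(0.0, 260.0)

# A constant series falls back to the minimum resolution.
flat = rdse_resolution(5.0, 5.0)
```

Because the result depends on the min/max of the data, it has to be computed outside the encoder and passed in as the resolution.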


I only intended this for HTM.Java?

The numBuckets parameter can be specified from an external config file in the NAB (and therefore NuPIC) but not in HTM.Java.

The issue I mean to file is for HTM.Java: to be able to specify this externally and have the NAPI pick it up and apply it when found.

@rhyolight - Please Check this out:

…and tell me if you still recommend I not add this to HTM.Java?


It’s really up to you. I don’t like having configuration for an encoder that isn’t a direct pass-through into the encoder, but instead requires code to interpret it.


I understand… That’s exactly what we do. The NAPI takes parameters and applies them to each encoder, that’s all. By putting things in the Parameter class, we avoid asking the user to manicure each point in each component’s setup. However, this isn’t being done for the RDSE at the present time…

Otherwise the user has to write scaffolding code like what is found within the detector and other places. I’m trying to remove as much of this as possible…


Upon further inspection of your gist, @lscheinkman and I are confident this is not the problem. The “fixes” you called out are handled in

FWIW, to run one data file through NAB at a time for debugging purposes, I recommend modifying the NAB runner to detect one dataset: simply change the multiprocess call, args) to detectDataSet(args[0]). This allows you to debug your detector ( on a single file without having to modify any of its internals. Alternatively, the less hacky way of doing this is detailed here in the NAB readme.



Can someone give me explicit detailed instructions for how to run both the Python and the Java versions with the file of my choosing?

This is Greek to me. How do I run it? What are the instructions for running it for both the Python and the Java side of things? How do I specify a single file in that line? Do I need a path to the file? I don’t use Python, so I need this explained. Actually, this is where I said before I would hand this off to you and Luiz, but I just want to see the outputs of both languages to compare the quality of the Anomaly scores. I also want to see what the internal parameters are for both so I can verify in a matter-of-fact way what is happening.

Also, from the meeting I asked for outputs from the same input file of both the Python and Java versions so I could see how the quality of the anomaly scores compares?

EDIT: From what I’ve read you can’t get the scores when this is run in “one-file” mode? I feel like I’m blindfolded with one hand tied behind my back. I can’t debug across Python and Java processes using my IDE (Eclipse). I can’t output things to Standard Out because that’s being co-opted for inter-process communication between Python and Java, so how am I supposed to debug this? I can’t even get the scores so that I can compare the quality of results between Python and Java without doing a 30-minute run!?!? :stuck_out_tongue:

This is why I asked for to be updated with simply the Anomaly code so that I can compare it directly to with no framework indirection; but I’m given some NetworkAPI stuff that is useless to me because I need to see everything working and not a black box I can request results from. I’m really at a loss here for what to do next? (I’m very grateful for @alavin’s effort to do this, but it is useless to me as a means to do porting work).

I’m in crisis mode here…


Here’s an example of debugging an algorithm by running one NAB file – the “realKnownCause/nyc_taxi.csv”. In nab/, this would be your detect() method:

	def detect(self, detectors):
		"""Generate results file given a dictionary of detector classes

		Function that takes a set of detectors and a corpus of data and creates a
		set of files storing the alerts and anomaly scores given by the detectors

		@param detectors     (dict)         Dictionary with key value pairs of a
		                                    detector name and its corresponding
		                                    class constructor.
		"""
		print "\nRunning detection step"

		count = 0
		args = []
		for detectorName, detectorConstructor in detectors.iteritems():
			for i, (relativePath, dataSet) in enumerate(self.corpus.dataFiles.iteritems()):

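				if self.corpusLabel.labels.has_key(relativePath):
					# Restrict the run to the single file being debugged.
					if "nyc_taxi" in relativePath:
						# (the args.append(...) call for this dataset goes here)
						count += 1

		# The multiprocess call is commented out and the one dataset is run directly.
		#, args)
		detectDataSet(args[0])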

And then I run from the command line python --skipConfirmation -d htmjava --detect. NAB automagically writes the detection results to CSV files. For this example you’ll find it in “/results/htmjava/realKnownCause/htmjava_nyc_taxi.csv”. If you want to compare to any results file with the numenta detector, all of those are in the numentaTM dir of the repo results.

If you want NAB to detect and score a single file (or a subset of files), follow the approach I previously linked: your command line run would be python -d htmjava --detect --score --windowsFile labels/combined_windows_tiny.json. You’ll see that after the scoring step, NAB writes scoring info into those same results CSVs.


I looked for examples of this, but they all involve an abstraction through a Region. I don’t see any examples of SP / TM ==> Anomaly.



We can’t be satisfied with that. It’s obvious that whoever wrote the Region knows how to pass the data and what should be passed? If it’s done within a Region it can be done outside of it? Can you ask the author or an expert who knows what should be passed and how? I mean the Anomaly tests do it - why can’t we do it? I mean after all - I even did it. (In Java, and maybe not correctly - which is why I need to see it for myself in Python) :stuck_out_tongue:


You don’t need an expert to tell you that. I can tell you that. I think we’ve already discussed what should be passed. The Anomaly.compute() function takes the current active columns from the SP, the previously predicted columns from the TM, and the raw input value.
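To make the inputs concrete, here is a minimal standalone sketch of the raw (overlap-based) anomaly score: the fraction of currently active columns that were not predicted on the previous step. It is an illustration only, not NuPIC’s or HTM.Java’s implementation. The function name and the column indices are made up, and the raw input value is left out because this sketch only covers the overlap part of the score.

```python
def raw_anomaly_score(active_columns, prev_predicted_columns):
    """Fraction of active columns that were NOT predicted one step earlier."""
    active = set(active_columns)
    if not active:
        # Nothing is active, so there is nothing to be surprised about.
        return 0.0
    unpredicted = active - set(prev_predicted_columns)
    return len(unpredicted) / float(len(active))


# Each step: (columns active now, columns predicted for the NEXT step).
# These indices are hypothetical stand-ins for SP and TM output.
steps = [
    ([1, 2, 3, 4], [1, 2, 3, 4]),
    ([1, 2, 3, 4], [5, 6, 7, 8]),
    ([5, 6, 9, 10], []),
]

prev_predicted = []  # nothing has been predicted before the first record
scores = []
for active, predicted_next in steps:
    # Compare the current SP output with what the TM predicted last step.
    scores.append(raw_anomaly_score(active, prev_predicted))
    # Carry this step's predictions forward for the next comparison.
    prev_predicted = predicted_next
```

The important detail is the one-step lag: the predicted columns must be the ones produced on the previous record, not the predictions made after seeing the current one.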

You can see this in the CLAModel:


I don’t understand why you wrote this then?



Thank you Alex. I really appreciate that. This should be very helpful in diagnosing things.


You have been asking for an explicit example of data coming from a SpatialPooler instance and a TemporalMemory instance, passing directly into an Anomaly instance, right? I don’t see any examples of this that do not involve the Region abstraction. But you can see what Anomaly.compute wants just by looking at how it is used in the CLAModel. That data, however, is not coming directly from SP / TM objects like you are asking.


Oh… I see what you’re saying… You aren’t telling me my request is impossible to help with, right? fingers crossed :slight_smile:


No, of course not. But it is going to take some time to put together the example you are asking for.


I finally see (from the other side) why I was taught something when I used to write programs for Stock Brokers. My manager told me that:

“…sometimes people don’t always want to hear every little step in your thinking or approach to a problem because they view every interim step as something final. Sometimes you just need to keep it to yourself until you come to them with a solution.”

Now I understand why, and why I used to get Stock Brokers freaking out on me… :wink:

Err… Besides the fact that they were freaks in a very tense environment to begin with… :stuck_out_tongue: