Hi all,
In the process of improving NAB by testing anomaly detection algorithms for addition to the repo, @lscheinkman has opened a PR that implements the HTM.java detector. With the parameters matched to the current HTM detector we would expect very similar results, but we are finding that the HTM.java detector’s scores on NAB are on par with making random detections.
@cogmission has done a careful job of ensuring 100% test parity between the HTM.java and NuPIC implementations, implying there is a bug somewhere not covered by the tests, an issue in how HTM.java is used in NAB, or some implementation difference between Java and Python that only manifests in certain situations.
Would anyone in the community be able to help us out and investigate this issue? An initial step could be comparing hotgym results using both NuPIC and HTM.java, moving on to NAB data files later.
I was looking at this a few months ago but ran out of steam. Why not simply run HTM.java over the benchmarks, output the raw anomaly scores, and compare them with NuPIC’s (in R or similar)?
Yes, there are many differences in the output raw scores, which are written out for every data file in NAB. The debugging would call for isolating the discrepancies in order to find their root causes; i.e., running NuPIC and HTM.java models on one of these data files and investigating the model states during the streams (specifically at points where the raw anomaly scores are very different).
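For instance, a quick script along these lines could surface the timestamps where the two implementations diverge most (a rough sketch only; the result-file paths, detector name, and score column are illustrative and should be adjusted to whatever your NAB checkout actually writes out):

```python
# Sketch: compare per-row anomaly scores from a NuPIC result file and an
# HTM.java result file for the same NAB data file. Paths and the column name
# are placeholders -- point them at the files your NAB run actually produced.
import pandas as pd

nupic_csv = "results/numenta/realAWSCloudwatch/numenta_ec2_cpu_utilization_825cc2.csv"
java_csv  = "results/htmjava/realAWSCloudwatch/htmjava_ec2_cpu_utilization_825cc2.csv"
score_col = "raw_score"  # or "anomaly_score", depending on which score you want to compare

nupic = pd.read_csv(nupic_csv, parse_dates=["timestamp"])
java  = pd.read_csv(java_csv, parse_dates=["timestamp"])

# Line the rows up by timestamp and measure the per-row disagreement.
merged = nupic.merge(java, on="timestamp", suffixes=("_nupic", "_java"))
merged["diff"] = (merged[score_col + "_nupic"] - merged[score_col + "_java"]).abs()

# The largest disagreements are the points worth inspecting model state around.
print(merged.nlargest(20, "diff")[
    ["timestamp", "value_nupic", score_col + "_nupic", score_col + "_java", "diff"]])
```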
I was hoping someone might have time to fix the conflicts on the PR so that I can download and run it?
(Apparently you can’t download a PR if it has conflicts; or at least the “command line” link isn’t present to give you the options. I got the commands from another working PR, but the files still have conflicts, which I’m not Python-savvy enough to resolve.)
Also, can you guys tell me how to run the htm.java detector once I have the PR downloaded?
====
Re: The detector code
I took a look at the HTMModel.java file and everything looks good to me! I don’t see anything that sticks out as a Java no-no, so that’s good!
The above gist has a test class which illustrates the consistency between the NAPI and a raw assembly of algorithms (only the ScalarEncoder -> SpatialPooler -> TemporalMemory right now).
I haven’t yet added the Anomaly classes, but I will do that next. I just thought the setup might be useful, especially the confirmation that the NAPI is totally deterministic and consistent with at least these components. You will see that the inputs and outputs at various stages are exactly the same between the two.
Hi David,
Thank you for reviewing the Java code; I wasn’t sure if I had missed anything when setting up the network.
I’ve fixed the merge conflicts and was able to install and run the detector following the README instructions.
Let me know if it works for you now.
Luiz
@cogmission thanks for that. My concern is not about the difference between the outputs when using the Network API or just a Layer, nor about the Anomaly class (which is just being used functionally). It’s about the interaction at the Layer level between the TemporalMemory and a custom Observer which is possibly also mutating the TM, or not mutating itself properly, or both. This Observer stuff is not used in the NuPIC NAB code, and we’ve already discovered that it needs better testing, so that’s where my nose is leading me right now.
The result of this work will be a much more robust HTM.java, especially for anomaly detection, and hopefully an answer and a fix for the trouble @lscheinkman is having. So hats off to all for rolling up their sleeves.
@fergalbyrne Sure, no problem. I understand where your focus is, which is why I’m helping out by eliminating possibilities. If we know that the NAPI has no impact on the algorithm output (or that it does, and where that impact is), that takes us a very long distance toward identifying the problem. Thanks to the above test, we now have proof that there are no discrepancies in the NAPI’s event emissions between an Encoder, SpatialPooler, and TemporalMemory.
Btw, the TemporalMemory cannot be mutated because it has no state, and the state in the Layer is never used internally; it only provides a convenient place (other than the Inference that is passed between nodes) for a user to query the current state. Again, to be clear: accessor methods on the Layer are not used internally; they exist so the user can get state from the Layer as well as from the Inference container object passed to the Layer.
Some background. The Observable framework is used in the NAPI to provide a convenient way to:
Connect together any type of algorithm dynamically at runtime.
Allow a Layer to dynamically adapt at runtime to the type of data it is receiving at its inputs (i.e., the user can submit an Inference, int[], String[], or Map). Functions can easily be added to allow for other types of input.
Provide a common “language” of transmission between “nodes” (Observables) in the Layer, and a means of defining the emitted item and operations on that item (Inference).
Make it easier to do things like letting an Observable node dynamically infer and adapt its input dimensions, removing the need for the application developer to manually hard-code input dimensions into each Layer.
Allow a user to plug in a custom Observable that houses a desired algorithm without any manual configuration; just connect it. This is possible because the interface to the inputs and outputs is pre-designed.
Let the emitted Inferences optionally participate in other map/reduce and combination operations peripheral to the NAPI, using common ReactiveX idioms and methods.
Interact with Java 8 Streams so that pull-based Streams can be taken up into push-based ReactiveX event emissions (used with the Sensors). Additionally, there is an overarching readiness for parallelism if that is ever desired.
Allow for a very concise and compact expression of a Network assembly and configuration (a tad more taxing for a NAPI developer, but extremely easy for a user).
I asked @Subutai this question a while back, but I want to confirm once again.
The Anomaly class takes the “predictive columns” from t-1, not the “predictive columns” from t, correct? (In addition to the current column activations.)
To be clear: I’m saying that the columns (not cells) from the last TM cycle (which predict what should currently be seen as active at the input), PLUS the currently active columns, are given to the Anomaly code each compute cycle, correct?
It would make sense for the Anomaly code to look at what was input and compare it to what was predicted to be input, but I want to be sure it doesn’t use the current prediction of what will be input in the next cycle (which wouldn’t make much sense at all, but I just want to be certain).
@alavin Yeah my version of NuPIC is probably very old. I think like around the time Yuwei did the “Orphan Segment” code…
I’ll bug @rhyolight to assist me in updating my NuPIC and then run it again…
By the way, I’ve identified the problem!
It definitely is in the Anomaly function in the Layer class (previousPredictiveCells is not set or overridden) - I’m about to fix it now… woohoo! (I think @fergalbyrne hinted at this before)
Yes, the predictions from t-1 are compared to the activations at t to calculate the raw anomaly score. The final anomaly score is then computed with the anomaly likelihood algorithm.
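Concretely, the raw score is essentially the fraction of currently active columns that were not predicted on the previous step. A minimal sketch (the set-based helper below is illustrative; NuPIC’s anomaly module may handle the edge cases slightly differently):

```python
def compute_raw_anomaly_score(active_columns, prev_predicted_columns):
    """Raw anomaly score: fraction of columns active at t that were NOT
    among the columns the TM predicted at t-1.

    active_columns          -- set of column indices active at time t
    prev_predicted_columns  -- set of column indices predicted at time t-1
    """
    if not active_columns:
        return 0.0  # assumed convention when nothing is active
    unpredicted = active_columns - prev_predicted_columns
    return len(unpredicted) / float(len(active_columns))
```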
Bad news (in a way): I wasn’t using the previous “predictiveColumns” in my control (my “control” is the file which merely assembles all the algorithms and runs the data straight through). I was instead using the current “predictiveColumns”, so that is why there was a disparity between the control and the NAPI.
Now they’re exactly the same! So there is no problem in the Network API… because it gets the same output as the straight algorithms…
So Alex, I want to try something else. I’d like to have you, @lscheinkman, @rhyolight, or someone else who can do it, add the Anomaly functions to this code, which is (almost exactly) the Python version of my “control” file, and then I will run the same data with all three and see if the output is similar (it won’t be exactly the same due to the RNG differences).
So here is the Gist containing the Python file. If you could correctly add the Anomaly code to this file, then I’d have a tool for comparison…
Specifically add it here: (on line # 81)
This way I can also look at what you’re handing the Anomaly code just to be sure…
EDIT: Also Alex, Luiz, or Matt - if you could correct any Pythony badness in that file (I’m just learning Python), that would be good (and also point it out to me so I can learn :-)), because that public Gist is my go-to file when someone from the community asks how to experiment with “piecing together” the algorithms themselves in Python. It’s a good experimental file because it shows everything: users can see the predictions for something simple and see how to interact with the API to get the results!
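For reference, here is my rough mental model of how the loop would look once the Anomaly code is in place, so you can see what I expect to be handing it each cycle. This is just a sketch: the five arguments stand in for whatever the Gist actually builds (encoder, SP, TM, etc.) and are not the real NuPIC calls; only the ordering of the steps matters.

```python
def run_with_anomaly(rows, encode, pool, remember, predicted_columns_after):
    """Sketch of the control loop with the anomaly calculation added.

    rows                      -- iterable of (timestamp, value) pairs
    encode(value)             -- the Gist's encoder step
    pool(encoding)            -- the Gist's SP step, returning active column indices
    remember(active_columns)  -- the Gist's TM compute step
    predicted_columns_after() -- columns the TM now predicts for the next step
    """
    prev_predicted = set()  # TM predictions carried over from t-1
    for timestamp, value in rows:
        active = set(pool(encode(value)))

        # Raw anomaly score (as sketched above): what was predicted LAST cycle
        # versus what is active NOW.
        if active:
            raw_score = len(active - prev_predicted) / float(len(active))
        else:
            raw_score = 0.0

        # Run the TM on this cycle, then save its predictions for the next cycle.
        remember(active)
        prev_predicted = set(predicted_columns_after())

        print(timestamp, value, raw_score)
```

That way the comparison is always “predicted at t-1” versus “active at t”, which matches what was confirmed above.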
@cogmission would you please explain a bit more what the intent of this Python code is? That is, how are you using it to debug and/or test the Java code? I ask because there may be better options (than adding more code to the mix, which may introduce more bugs). Thanks.
The idea behind the Python code is to assemble the raw algorithms together in such a way that nothing is obscured by infrastructure code. I can see what goes into and comes out of each algorithm, and if you “vet” the code, I can know once and for all what goes into what and what is expected out.
See, the problem is that there aren’t any tests that show how things are connected. There are many great tests for the algorithms, and there are (some) integration tests for infrastructure code such as the NetworkEngine, but there is no documentation saying “this goes into that, and this is what is expected as input/output for this algorithm.”
I don’t really want to argue whether the above is true or not - if somebody feels that there are plenty of examples of the above, then please point me to them (but please help me out with this request as well).
Summary: I would like to see everything pieced together in an officially sanctioned pseudo region (a FauxRegion, if you will), and then I will do the same with the Java code using the same data. I will see how long it takes the TM to produce good predictions, when the SP stabilizes, when the classifier reports the expected predictions, and when the Anomaly code settles in to anomaly scores of 0.0. After that I will introduce one or more anomalies in both the Python and Java versions and see exactly how they each handle them - that should give me an idea of how the Java version is performing, and perhaps some insight into whatever the problem is.
I hope that makes sense?
EDIT: Another point. I ask for this because I can reproduce the exact same thing in Java, so there is a one-to-one correspondence between the two, and I can play with each and see how they perform (i.e., I should see nearly the exact same behavior in the Java version as I do in the Python one).