OK, interesting, so we just have something going on in our configs here. I’ll try to give you the full story. First off, here is my modified ‘runIOThroughNupic’ function within my run file.
The other thing I’d note is that the new nupic didn’t appear to have an ‘examples’ folder, so I created one in order to recreate my old path: ‘examples/opf/clients/hotgym/anomaly/one_gym’. The ‘one_gym’ folder was copied over from the earlier version to preserve what I’d done so far, and I wonder if it’s looking in the wrong place for the CoordinateEncoder because of that or something.
Thanks to @rhyolight and @sheiser1, who have worked so hard while I was sleeping. I’m off on holiday in a couple of hours, so it may be a week before I return and can check it out.
Hey @sheiser1, we don’t need to release a bugfix to NuPIC, just add:
'clEnable': False, 'clParams': None,
in your model params right after tmParams. (Thanks @scott for figuring this out.)
These two parameters disable classification in the OPF, which is necessary because the CoordinateEncoder doesn’t know how to convert columns back into coordinates (which is the point of classification). But we can still get anomaly indications, so it is actually still quite useful.
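For reference, those keys sit inside the `modelParams` dictionary of your model params file, right after `tmParams`. A minimal sketch (only the relevant keys shown; the `tmParams` contents are placeholders, not real settings):

```python
# Hypothetical excerpt of an OPF model params dict; only the keys
# relevant to this fix are shown.
MODEL_PARAMS = {
    'model': 'HTMPrediction',
    'modelParams': {
        'tmEnable': True,
        'tmParams': {
            # ... your temporal memory settings go here ...
        },
        # Disable the classifier so the OPF never asks the
        # CoordinateEncoder to map columns back into coordinates:
        'clEnable': False,
        'clParams': None,
    },
}
```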
It worked!! Thanks @scott for the great insight and @rhyolight for sticking it out and helping a ton this whole time!
To close out this topic I just want to sanity-check my use of the coordinate encoder with my data. My data is composed of 2 fields, movement in the ‘x’ and ‘y’ directions. The values are small, usually between -0.5 and 0.5. I see that these values get converted to integers in forming the ‘vector’ that goes into the encoder, so I simply multiply each value by 100; raw values of, say, -0.37 and 0.24 become -37 and 24. Here is what that loop looks like:
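In simplified form, the conversion looks like this (variable names and sample values are illustrative, not from my actual run file):

```python
# Sketch of the conversion described above: scale the small float
# movements by 100 and round to integer coordinates for the encoder.
rows = [(-0.37, 0.24), (0.12, -0.05)]  # example (x, y) movements

vectors = []
for x, y in rows:
    # e.g. (-0.37, 0.24) becomes [-37, 24]
    vector = [int(round(x * 100)), int(round(y * 100))]
    vectors.append(vector)
```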
I bring this up because I haven’t (yet) had as much success using the CoordinateEncoder as I have with 2 simple scalar encoders, even though the values are coordinates. I think it’s because the scalar encoders allow me to specify a tight min and max on the scalar values. Is there any sort of equivalent for the coordinate encoder, or does anything look off in how I’m doing this? I have a hunch that the CoordinateEncoder has the highest potential given that it’s the most appropriate, and I want to be sure I’m wielding it right! Thanks a million again!
I’m not sure about the specifics, but it sounds like everything is working, just that the anomaly scores aren’t quite what you expect? The granularity could have a big impact. You can multiply by a number other than 100 when converting to integer coordinates to change the granularity, which may have an effect similar to what you were doing with the scalar encoder.
I would try a very coarse encoding initially (multiply the raw values by 5 instead of 100) and make sure that you see low anomaly scores for predictable sequences after they have been seen a few times. Then increase the precision (multiply by a larger number) as much as you can while still getting good results. Just keep in mind that at low precision, any remotely similar positions will be treated as the same, so you will see many incorrectly low anomaly scores. But it could be that multiplying by 100 is either too high (or too low) to strike the right balance with the anomaly scores.
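To make the trade-off concrete, here is a small illustration (a standalone sketch, not NuPIC code; `to_int_coord` is a hypothetical helper):

```python
def to_int_coord(x, y, scale):
    """Scale small float movements and round to integer coordinates;
    'scale' controls the encoder's effective granularity."""
    return (int(round(x * scale)), int(round(y * scale)))

# Coarse (scale 5): nearby positions collapse to the same integer
# coordinate, so the model treats them as identical input.
coarse_a = to_int_coord(0.37, 0.24, 5)
coarse_b = to_int_coord(0.41, 0.20, 5)

# Fine (scale 100): the same two positions stay distinct, so the model
# can tell them apart, at the cost of less generalization.
fine_a = to_int_coord(0.37, 0.24, 100)
fine_b = to_int_coord(0.41, 0.20, 100)
```

At the coarse scale the two positions land in the same cell, while at the fine scale they remain separate, which is exactly the balance between false-low anomaly scores and over-sensitive ones.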
Thanks Matt for the prompt answer.
I am afraid, though, that this bug isn’t fixed in version 1.0.1.
Even when I stick to integer coordinates only, and even when I don’t care about prediction but only try to run anomaly detection, I still end up with an assertion failure: “Should be implemented in base classes that are not containers for other encoders”. Stepping through the code line by line, I see it is still because getBucketIndices() on the parent base encoder receives None for self.encoder.encoders and thus hits the ‘assert False’ line.
This is exactly the same bug pointed out by Austin_Marshall, and unfortunately I can still reproduce it over and over with version 1.0.1.