OK, interesting, so we just have something going on in our configs here. I’ll try to give you the full story. First off, here is my modified ‘runIOThroughNupic’ function within my run file.
The other thing I’d note is that the new nupic didn’t appear to have an ‘examples’ folder, so I created one in order to recreate my old path: ‘examples/opf/clients/hotgym/anomaly/one_gym’. The ‘one_gym’ folder was copied over from the earlier version to preserve what I’d done so far, and I wonder if it’s looking in the wrong place for the CoordinateEncoder because of that or something.
Thanks to @rhyolight and @sheiser1, who have worked so hard while I was sleeping. I’m off on holiday in a couple of hours, so it may be a week before I return and can check it out.
Hey @sheiser1, we don’t need to release a bugfix to NuPIC, just add:
'clEnable': False, 'clParams': None,
in your model params right after tmParams. (Thanks @scott for figuring this out.)
These two parameters disable classification in the OPF, which is necessary because the CoordinateEncoder doesn’t know how to convert columns back into coordinates (which is the point of classification). But we can still get anomaly indications, so it is actually still quite useful.
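For reference, those keys sit inside the `modelParams` dictionary of your model params file, right after `tmParams`. A minimal sketch (only the relevant keys shown; the `tmParams` contents are placeholders, not real settings):

```python
# Hypothetical excerpt of an OPF model params dict; only the keys
# relevant to this fix are shown.
MODEL_PARAMS = {
    'model': 'HTMPrediction',
    'modelParams': {
        'tmEnable': True,
        'tmParams': {
            # ... your temporal memory settings go here ...
        },
        # Disable the classifier so the OPF never asks the
        # CoordinateEncoder to map columns back into coordinates:
        'clEnable': False,
        'clParams': None,
    },
}
```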
It worked!! Thanks @scott for the great insight and @rhyolight for sticking it out and helping a ton this whole time!
To close out this topic I just want to sanity-check my use of the coordinate encoder with my data. My data is composed of 2 fields, movement in the ‘x’ and ‘y’ directions. The values are small, usually between -0.5 and 0.5. I see that these values get converted to integers in forming the ‘vector’ that goes into the encoder, so I simply multiply each value by 100; raw values of, say, -0.37 and 0.24 become -37 and 24. Here is what that loop looks like:
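In simplified form, the conversion looks like this (variable names and sample values are illustrative, not from my actual run file):

```python
# Sketch of the conversion described above: scale the small float
# movements by 100 and round to integer coordinates for the encoder.
rows = [(-0.37, 0.24), (0.12, -0.05)]  # example (x, y) movements

vectors = []
for x, y in rows:
    # e.g. (-0.37, 0.24) becomes [-37, 24]
    vector = [int(round(x * 100)), int(round(y * 100))]
    vectors.append(vector)
```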
I bring this up because I haven’t (yet) had as much success using the CoordinateEncoder as I have with 2 simple scalar encoders, even though the values are coordinates. I think it’s because the scalar encoders allow me to specify a tight min and max on the scalar values. Is there any sort of equivalent for the coordinate encoder, or does anything look off in how I’m doing this? I have a hunch that the CoordinateEncoder has the highest potential given that it’s the most appropriate, and I want to be sure I’m wielding it right! Thanks a million again!
I’m not sure about the specifics, but it sounds like everything is working, just that the anomaly scores aren’t quite what you expect? The granularity could have a big impact. You can multiply by a number other than 100 when converting to integer coordinates to change the granularity, which may have an effect similar to what you were doing with the scalar encoder.
I would try a very coarse encoding initially (multiply the raw values by 5 instead of 100) and make sure that you see low anomaly scores for predictable sequences after they have been seen a few times. Then increase the precision (multiply by a larger number) as much as you can while still getting good results. Just keep in mind that at low precision, any remotely similar positions will be treated as the same, so you will see many incorrectly low anomaly scores. But it could be that multiplying by 100 is either too high (or too low) to strike the right balance with the anomaly scores.
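To make the trade-off concrete, here is a small illustration (a standalone sketch, not NuPIC code; `to_int_coord` is a hypothetical helper):

```python
def to_int_coord(x, y, scale):
    """Scale small float movements and round to integer coordinates;
    'scale' controls the encoder's effective granularity."""
    return (int(round(x * scale)), int(round(y * scale)))

# Coarse (scale 5): nearby positions collapse to the same integer
# coordinate, so the model treats them as identical input.
coarse_a = to_int_coord(0.37, 0.24, 5)
coarse_b = to_int_coord(0.41, 0.20, 5)

# Fine (scale 100): the same two positions stay distinct, so the model
# can tell them apart, at the cost of less generalization.
fine_a = to_int_coord(0.37, 0.24, 100)
fine_b = to_int_coord(0.41, 0.20, 100)
```

At the coarse scale the two positions land in the same cell, while at the fine scale they remain separate, which is exactly the balance between false-low anomaly scores and over-sensitive ones.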
Thanks Matt for the prompt answer.
I am afraid, though, that this bug isn’t fixed in version 1.0.1.
Even when I stick to integer coordinates only, and even when I don’t care about prediction but only try to run anomaly detection, I still end up with an assertion failure: “Should be implemented in base classes that are not containers for other encoders”. Stepping through the code line by line, I see it is still because getBucketIndices() on the parent base encoder receives None for self.encoder.encoders and thus hits the ‘assert False’ line.
This is exactly the same bug pointed out by Austin_Marshall, and unfortunately I can still reproduce it over and over with version 1.0.1.