Debugging CoordinateEncoder issues in NuPIC 1.0

@sheiser1 @BrianJB I cannot replicate this error in either:

Can either of you share a project or some code I can run to replicate this issue? Everything is working for me.

Ok interesting, so we just have something in our configs here. I’ll try and give you my full story here. First off here is my modified ‘runIOThroughNupic’ function within my run file.

https://pastebin.com/cT6Ab2AE

The ‘inputData’ passed into it looks like this, just an x,y coordinate and no timestamp since it’s sub-second:

https://pastebin.com/r1qX16F2

Here’s the model params file:

https://pastebin.com/bjXK4K0T

The other thing I’d note is that the new nupic didn’t appear to have an ‘examples’ folder, so I created one in order to recreate my old path: ‘examples/opf/clients/hotgym/anomaly/one_gym’. The ‘one_gym’ folder was copied over from the earlier version to preserve what I’d done so far, and I wonder if it’s looking in the wrong place for the CoordinateEncoder because of that or something.

Ok, working on it.

Here it is: https://github.com/numenta/nupic/tree/master/examples

What about your calculate_radius() function definition? Is that taken from one of our examples?

I wrote that function myself, trying to emulate your function from mine-hack:

https://pastebin.com/P3nDR0yw

Do you think there may be an issue there?

I’m not ruling anything out.

Ok, I just replicated it! Thanks for the assistance.

Fix in a PR:

https://github.com/numenta/nupic/pull/3762

I will release 1.0.2 soon after this is approved and merged.

Thanks to rhyolight and sheiser1, who have worked so hard while I was sleeping. I’m off on holiday in a couple of hours. It may be a week before I return and can check it out.

Sounds good @BrianJB, have fun! Meanwhile I’ll be here ready for 1.0.2!

Hey @sheiser1, we don’t need to release a bugfix to NuPIC, just add:

'clEnable': False, 'clParams': None, 

in your model params right after tmParams. (Thanks @scott for figuring this out.)

This parameter will disable classification in the OPF, which is necessary because the CoordinateEncoder doesn’t know how to convert columns back into coordinates (the point of classification). But we can still get anomaly indications, so it is actually still quite useful.
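
For anyone landing here later, here is a minimal sketch of that change applied in Python rather than by editing the params file by hand. It assumes the usual OPF layout where tmParams sits under a top-level 'modelParams' key; the helper name disable_classifier and the placeholder params are made up for illustration and are not part of NuPIC.

```python
import copy


def disable_classifier(model_params):
    """Return a copy of the model params with OPF classification turned off."""
    params = copy.deepcopy(model_params)
    # Assumption: clEnable/clParams live next to tmParams under 'modelParams'.
    params['modelParams']['clEnable'] = False
    params['modelParams']['clParams'] = None
    return params


# Placeholder params; a real file also carries sensorParams, spParams, etc.
example_params = {
    'modelParams': {
        'tmEnable': True,
        'tmParams': {},
        'clParams': {'steps': '1'},
    },
}

patched = disable_classifier(example_params)
```

The copy keeps the loaded params untouched, so the same file can still be used to build a model with classification enabled for other encoders.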

It worked!! Thanks @scott for the great insight and @rhyolight for sticking it out and helping a ton this whole time!

To close out this topic I just want to sanity-check my use of the coordinate encoder with my data. So my data is composed of 2 fields, movement in the ‘x’ and ‘y’ directions. The values are small, usually between -0.5 and 0.5. I see that these values get converted to integers in forming the ‘vector’ that goes into the encoder. In response I simply multiply each value by 100, so raw values of say -0.37 and 0.24 are converted to -37 and 24. Here is what that loop looks like:
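
The scaling step described above can be sketched as follows. This is only an illustration of the multiply-then-round idea, not the code from the pastebin; the function name to_integer_coordinate is hypothetical, and the result would still need to be wrapped in a numpy array before being handed to the encoder.

```python
def to_integer_coordinate(x, y, scale=100):
    """Scale raw float movements and round them to integer coordinates."""
    return [int(round(x * scale)), int(round(y * scale))]


# Raw movements of -0.37 and 0.24 with the factor of 100 mentioned above.
coordinate = to_integer_coordinate(-0.37, 0.24)
```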

https://pastebin.com/RJdjwTDs

Here is a copy of the ‘get_radius()’ function, which was supposed to emulate what Matt did in mine-hack:

https://pastebin.com/iJYyqxX4

I bring this up because I haven’t (yet) had as much success using the CoordinateEncoder as I have with 2 simple scalar encoders, even though the values are coordinates. I think it’s because the scalar encoders allow me to specify a tight min and max on the scalar values. Is there any sort of equivalent for the coordinate encoder, or anything that looks off in how I’m doing this? I have a hunch that the CoordinateEncoder has the highest potential given that it’s the most appropriate, and I want to be sure I’m wielding it right! Thanks a million again!

I’m not sure about the specifics, but it sounds like everything is working and the anomaly scores just aren’t quite what you expect? The granularity could have a big impact. You can multiply by a number other than 100 when converting to integer coordinates to change the granularity, which may have a similar effect to what you were doing with the scalar encoder.

I would try a very coarse encoding initially (multiply the raw values by 5 instead of 100) and make sure that you see low anomaly scores for predictable sequences after they have been seen a few times. Then increase the precision (multiply by a larger number) as much as you can while still getting good results. Just keep in mind that at low precision, any remotely similar positions will be treated as the same, so you will see many incorrectly low anomaly scores. But it could be that multiplying by 100 is either too high or too low to get the right balance with the anomaly scores.
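
To make the granularity trade-off concrete, here is a small sketch that counts how many distinct integer coordinates a few raw positions map to at two scale factors. The sample points and the helper names are invented for illustration; only the factors 5 and 100 come from the discussion above.

```python
def to_integer_coordinate(x, y, scale):
    """Scale a raw (x, y) position and round it to integer coordinates."""
    return (int(round(x * scale)), int(round(y * scale)))


def count_distinct(points, scale):
    """Count how many different integer coordinates the points map to."""
    return len(set(to_integer_coordinate(x, y, scale) for x, y in points))


# Made-up raw movements in the -0.5..0.5 range mentioned earlier.
points = [(0.12, 0.12), (0.16, 0.14), (-0.40, 0.32)]

coarse = count_distinct(points, 5)    # the first two points collapse together
fine = count_distinct(points, 100)    # every point keeps its own coordinate
```

At the coarse scale the two nearby positions become the same input, which is what makes similar positions indistinguishable; at the fine scale they stay separate.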

Thanks Matt for the prompt answer.
I am afraid, though, that this bug isn’t fixed in version 1.0.1.
Even when I stick to integer coordinates only, and even when I don’t care about prediction but only try to run anomaly detection, I still end up with the assertion failure: “Should be implemented in base classes that are not containers for other encoders”. Following the code line by line, I see it is still because getBucketIndices() of the parent base encoder receives a None object for self.encoder.encoders and thus hits the ‘assert False’ line.
This is exactly the same bug pointed out by Austin_Marshall, and unfortunately I still reproduce it over and over again with version 1.0.1.
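
The failure mode described above can be illustrated with a toy example. This is a simplified stand-in written for this thread, not NuPIC's actual source: the class names are hypothetical, and only the method name getBucketIndices, the encoders attribute, and the assertion message are taken from the post.

```python
class ToyEncoder(object):
    """Toy stand-in for an encoder base class (not NuPIC's real code)."""

    def __init__(self, encoders=None):
        # Containers hold a list of (name, sub_encoder) pairs; leaves hold None.
        self.encoders = encoders

    def getBucketIndices(self, inputData):
        if self.encoders is None:
            # A leaf encoder that did not override this method ends up here.
            assert False, ("Should be implemented in base classes that are "
                           "not containers for other encoders")
        indices = []
        for name, encoder in self.encoders:
            indices.extend(encoder.getBucketIndices(inputData[name]))
        return indices


class LeafWithBuckets(ToyEncoder):
    def getBucketIndices(self, inputData):
        return [int(inputData)]


class LeafWithoutBuckets(ToyEncoder):
    pass  # inherits the base method and therefore trips the assertion


container = ToyEncoder([('a', LeafWithBuckets())])
ok = container.getBucketIndices({'a': 7})

try:
    LeafWithoutBuckets().getBucketIndices(7)
    failed = False
except AssertionError:
    failed = True
```

In this toy version, anything that asks a leaf encoder for bucket indices fails unless that leaf overrides the method.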

Can you please include the entire stack trace you are getting?

Did you also add the clEnable and clParams settings from above? Not sure if it helps, but try it.

Unfortunately changing clParams to None reproduces the same error.

Is there any code snippet showing that opf can work with CoordinateEncoder in either 1.0.0 or 1.0.1?

The code is so very basic:

import json

import numpy as np
import pandas as pd
from nupic.frameworks.opf.model_factory import ModelFactory

# Raw strings keep the Windows backslashes literal.
df = pd.read_csv(r'C:\data-science-projects-code\HTMwrapper\data_in.csv',
                 parse_dates=['dttm'], index_col='dttm')
df.columns = ['col1', 'col2']

# Load the OPF model params from JSON and build the model.
encoder_path = r'C:\data-science-projects-code\HTMwrapper\models\CE_2D.json'
with open(encoder_path) as fp:
    model_params = json.load(fp)
model = ModelFactory.create(model_params)

model.enableInference({'predictedField': 'col1'})
enc_radius = 10
for i in xrange(len(df)):
    # The coordinate field is passed as (integer coordinate array, radius).
    record = {'dttm': df.index[i],
              'col1': (np.array([int(np.round(df.col1[i])), int(np.round(df.col2[i]))]),
                       enc_radius)}
    result = model.run(record)
    print result