Does the field name impact the anomaly/prediction results?

Hi, I am playing with the NuPIC OPF hotgym anomaly example, and I found that changing the field name “kw_energy_consumption” to “value” when I feed data to the model produces very different results. Why is that?

Where are you changing the field name? In OPF model params?

Yes, in model_params (replacing kw_energy_consumption with value) and also in run.py: model.run({"value": xxxx, …}) and model.enableInference({"predictedField": "value"}).
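
For anyone trying to reproduce this, one way to rule out a missed occurrence is to rename the field mechanically instead of by hand. The following is only a sketch: `rename_field` is a hypothetical helper, not part of NuPIC, and `example_params` is a trimmed stand-in for a params dict, not the real hotgym model_params.

```python
def rename_field(params, old, new):
    """Return a new structure in which every dict key or value equal to
    `old` is replaced by `new`. The input is left unmodified."""
    if isinstance(params, dict):
        return {
            (new if key == old else key): rename_field(value, old, new)
            for key, value in params.items()
        }
    if isinstance(params, (list, tuple)):
        return type(params)(rename_field(item, old, new) for item in params)
    return new if params == old else params


# Hypothetical, heavily trimmed stand-in for an encoder section.
example_params = {
    "sensorParams": {
        "encoders": {
            "kw_energy_consumption": {
                "fieldname": "kw_energy_consumption",
                "name": "kw_energy_consumption",
            }
        }
    }
}

renamed = rename_field(example_params, "kw_energy_consumption", "value")
print(renamed)
```

The same new name then has to be used in every input row dict and in the predictedField setting, since those live outside the params dict.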

Another thing: if you change the timestamp but feed in the same data sequence in the same order, the results are different too. (This makes more sense than the field-name change, but could you explain why it happens as well?)

Changing the timestamp is definitely going to change the prediction because the pattern is getting scrambled. But changing the field name should not change results.

I think so too. Could you reproduce the same results when you change the field name? I have tried several names since I found this issue, and so far it looks like only “value” causes trouble. Maybe NuPIC is using the name “value” internally?

I wonder if anyone else can duplicate this?

Take a NuPIC OPF model you’ve used in the past and change the field name in the places @rainyyun noted above. Save the results in another file and compare. What do you see?
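
The comparison can be scripted so that small differences are not missed by eye. This is a minimal sketch using only the standard library. It assumes both runs wrote a CSV with a header row and a numeric column with the same name in both files; the column and file names in the usage note are placeholders, not names NuPIC guarantees.

```python
import csv


def read_column(path, column):
    """Read one numeric column from a CSV file that has a header row."""
    with open(path, newline="") as handle:
        return [float(row[column]) for row in csv.DictReader(handle)]


def compare_runs(path_a, path_b, column, tolerance=1e-9):
    """Return (row_index, value_a, value_b) for every row where the two
    files differ by more than `tolerance` in the given column."""
    values_a = read_column(path_a, column)
    values_b = read_column(path_b, column)
    if len(values_a) != len(values_b):
        raise ValueError(
            "row counts differ: %d vs %d" % (len(values_a), len(values_b))
        )
    return [
        (index, a, b)
        for index, (a, b) in enumerate(zip(values_a, values_b))
        if abs(a - b) > tolerance
    ]
```

For example, `compare_runs("original.csv", "renamed.csv", "anomaly_score")` returns an empty list when the two runs agree, and otherwise lists the rows that diverge.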

Just to clarify, when I said changing the timestamp, I meant changing something like 2010-07-02 00:00:00 to 1970-07-02 00:00:00, i.e. changing only the year.

Ok. This is only meant as a hint to steer folks in what may be the right direction; I’m not endorsing it. I seem to remember, while working with NAB (briefly, before other things blew up in my face), that “value” may be a “templated” field that gets swapped for a field name in the CSV data file. I may be wrong about this, but I think I remember the inner workings doing something like this. So “value” being a kind of “reserved” field name may account for the varying behavior you’re seeing.

I believe @lscheinkman or @alavin can maybe shed light on this?

Depending on your params you may be using a timestamp encoder that considers the day of the week, so a weekday encoding would be different from that of a weekend. Swapping the years is likely to scramble this.
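
To make the day-of-week point concrete, here is a small standard-library sketch (not NuPIC code) that checks which weekday the same calendar date falls on in 2010 and in 1970. Whether this affects a given model is an assumption: it only matters if the date encoder params include weekday-derived features.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"
DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def weekday_index(ts):
    """Return 0 (Monday) through 6 (Sunday) for a timestamp string."""
    return datetime.strptime(ts, FMT).weekday()


def is_weekend(ts):
    """True when the timestamp falls on a Saturday or Sunday."""
    return weekday_index(ts) >= 5


for ts in ("2010-07-02 00:00:00", "1970-07-02 00:00:00",
           "2010-07-03 00:00:00", "1970-07-03 00:00:00"):
    label = "weekend" if is_weekend(ts) else "weekday"
    print(ts, DAY_NAMES[weekday_index(ts)], label)
```

2010-07-02 is a Friday while 1970-07-02 is a Thursday, so every row shifts by one weekday. The row that was a Saturday in 2010 becomes a Friday in 1970, which is why a weekend-aware encoding sees a different pattern for the same values.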

The field name defined in the encoder params must match the intended header of the CSV file.

Do you mean the header of rec-center-hourly.csv? What if I don’t have a CSV file and instead read the data from a queue?

@rainyyun You described feeding data row by row programmatically, so you set the field name in each input row dict. Alex is talking about another method of reading data via a file stream (I believe). This would apply if you were swarming.

Right. I set the field name in each input row dict.