Trouble using Category encoder C as special flag and with string datatype

Exception: E10002: Exiting due to receiving too many models failing from exceptions (6 out of 6).
assert types[self._resetIdx] == FieldMetaType.integer

Q1) I get this assertion error whenever I use C (category encoder) as a special flag in the data set for swarming. When I change the special flag to R or S, it works fine.
Q2) I get the same assertion error when I use string as the datatype for any input field.

Can anybody help me resolve these issues?

Please show us:

  • swarm description JSON file
  • small sample of data CSV, including headers

I am attaching a sample of my dataset below.

Regarding the swarm description file: I have automated everything, so my code reads everything from the CSV file and copies it into the required dictionary, which I have checked and found to be correct. This makes it easy to run a swarm over any type of dataset.

Note that in the 2nd column, 1 represents N (north) and 0 represents S (south). If I change 1 to N and 0 to S, with string as the datatype and no special flag, the swarm runs successfully but does not give the expected results. The results deviate greatly from the actual values, and I do not understand why.

Below I am attaching the output file and altMap (prediction1 is the prediction 1 time step ahead and prediction2 is the prediction 2 time steps ahead).

It looks like you have marked two data fields as R for reset, but this doesn’t make sense. The values in those fields don’t look like they are actually resetting sequences properly, either. See NuPIC Input Data File Format for details about how reset works.
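A quick header check along these lines can catch this before swarming. This is only a sketch: it assumes the file starts with three header rows (field names, field types, flags), and `check_reset_flags` is a hypothetical helper, not part of NuPIC. The `int` requirement mirrors the assertion in the error message at the top of this thread, and the sample data is made up.

```python
import csv
import io


def check_reset_flags(csv_text):
    """Return a list of problems with R-flagged columns (hypothetical helper).

    Assumes the first three rows are field names, field types, and flags.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    names, types, flags = rows[0], rows[1], rows[2]
    reset_cols = [i for i, flag in enumerate(flags) if flag.strip() == "R"]
    problems = []
    # Only one field should carry the reset flag.
    if len(reset_cols) > 1:
        problems.append("more than one field is flagged R: %s"
                        % ", ".join(names[i] for i in reset_cols))
    # The assertion in the error expects the reset field to be an integer.
    for i in reset_cols:
        if types[i].strip() != "int":
            problems.append("reset field %r has type %r, expected 'int'"
                            % (names[i], types[i].strip()))
    return problems


# Made-up sample: two R flags, one of them on a string field.
sample = (
    "timestamp,direction,reset,value\n"
    "datetime,string,int,float\n"
    "T,R,R,\n"
    "2015-01-01 00:00:00,N,1,3.2\n"
)
for problem in check_reset_flags(sample):
    print(problem)
```

An empty list means the header rows passed both checks.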

Ok, but there still must be a way to get that dict and print it to the screen during runtime. It probably isn’t right. I can’t even tell which field you are predicting without looking at the swarm file.

The swarming process creates model parameters, then those model parameters are used to create a model. What is your process? Are you swarming every time you run? I think you have something wrong at the core of your process… if I could see your code, it would help me debug it.

Actually, I do not understand exactly what these special flags mean or how to use them, even after going through the source you mentioned. It would be very helpful if you could elaborate on when, and for which type of field, each special flag should be used.

Here columns A, B, C, D, and E are the inputs, and column F is the output field to be predicted.
Column G is the prediction of column F 1 time step ahead, and column H is the prediction of column F 2 time steps ahead.

I am attaching this link containing the required files

I have not received any replies in 12 days. Please let me know if it is in an inappropriate format.

Sorry I did not get back to you sooner, but I’ve been traveling and been on vacation for the past week. Before I look at your code, I’d like to continue our conversation.

The R (reset) flag is outside of your data. It is a way for you to indicate that a sequence has ended. Every time NuPIC sees a 1 in that field, it indicates that some sequence in the data has completed and another will start. This should correspond with some logical break in input data sequences. For example, when analyzing vehicle movements, sequences might be reset after the engine turns off.

The S (sequence) flag has the same purpose, but a different method. Instead of indicating when a sequence should reset, this is a convenience that lets you add an id to the input data to identify different sequences (track ids in GPS tracks, for example). Whenever this value changes in the data, NuPIC assumes a new sequence is starting.
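To make the relationship between the two flags concrete, here is a small sketch that turns a column of sequence ids into the equivalent 0/1 reset column. It assumes the reset is placed on the first row of each new sequence (including the very first row), and `resets_from_sequence_ids` is a hypothetical helper for illustration, not a NuPIC function.

```python
def resets_from_sequence_ids(sequence_ids):
    """Return a 0/1 reset flag per row: 1 where a new sequence id begins."""
    resets = []
    previous = object()  # sentinel that never equals a real id
    for seq_id in sequence_ids:
        resets.append(1 if seq_id != previous else 0)
        previous = seq_id
    return resets


# Made-up track ids, like the GPS example above.
track_ids = ["a", "a", "a", "b", "b", "c"]
print(resets_from_sequence_ids(track_ids))
```

Either column carries the same information, so a data file would normally need only one of the two.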

This is a row of input, not what we need to look at. We need to see the model parameters. These are used when creating a model. It is the dict you pass into a model when it is created. That is what we need to print out and see:

from pprint import pprint

Use the pprint module to get a nicely formatted output.
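For example, something like the following would do. The `model_params` dict here is a made-up placeholder standing in for whatever dict your code builds and passes in when it creates the model; its keys and values are not real swarm output.

```python
from pprint import pformat, pprint

# Placeholder only: substitute the dict your own code builds.
model_params = {
    "predictedField": "F",
    "encoders": {
        "A": {"fieldname": "A", "type": "placeholder"},
        "B": {"fieldname": "B", "type": "placeholder"},
    },
}

# Print the nested dict in a readable, indented form.
pprint(model_params)

# pformat returns the same text as a string, e.g. for writing to a log file.
text = pformat(model_params)
```

Printing it right before the model is created shows exactly what the model receives.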