Can someone please review my .json and swarm data

Summary:

I’m the tic-tac-toe guy from a couple of weeks ago. I’m working on moving my tic-tac-toe board identification problem to a full game problem. To start, I have been trying to use category encoders rather than scalar encoders for the value of each position on the board. This change is causing me problems, though: since updating my swarm definition, I have started hitting an assert down in the NuPIC code. I’ve double-checked my training data and my .json file, but I cannot see the error. Could someone review my code and see if anything jumps out?

The call stack is this:

   Model Exception: Exception occurred while running model 2251: AssertionError() (<type 'exceptions.AssertionError'>)
Traceback (most recent call last):
  File "/home/mgstrein/.local/lib/python2.7/site-packages/nupic-0.5.7-py2.7.egg/nupic/swarming/utils.py", line 435, in runModelGivenBaseAndParams
    (completionReason, completionMsg) = runner.run()
  File "/home/mgstrein/.local/lib/python2.7/site-packages/nupic-0.5.7-py2.7.egg/nupic/swarming/ModelRunner.py", line 237, in run
    maxTimeout=readTimeout)
  File "/home/mgstrein/.local/lib/python2.7/site-packages/nupic-0.5.7-py2.7.egg/nupic/data/stream_reader.py", line 201, in __init__
    bookmark, firstRecordIdx)
  File "/home/mgstrein/.local/lib/python2.7/site-packages/nupic-0.5.7-py2.7.egg/nupic/data/stream_reader.py", line 297, in _openStream
    firstRecord=firstRecordIdx)
  File "/home/mgstrein/.local/lib/python2.7/site-packages/nupic-0.5.7-py2.7.egg/nupic/data/file_record_stream.py", line 237, in __init__
    FieldMetaType.integer)
AssertionError

test_search.json - this is long (sorry)

{
  "includedFields": [
    {
      "fieldName": "reset", 
      "fieldType": "int"
    }, 
    {
      "fieldName": "row_0_column_0", 
      "fieldType": "string"
    },
    {
      "fieldName": "row_1_column_0", 
      "fieldType": "string"
    }, 
    {
      "fieldName": "row_2_column_0", 
      "fieldType": "string"
    },
    {
      "fieldName": "row_0_column_1", 
      "fieldType": "string"
    }, 
    {
      "fieldName": "row_1_column_1", 
      "fieldType": "string"
    },
    {
      "fieldName": "row_2_column_1", 
      "fieldType": "string"
    }, 
    {
      "fieldName": "row_0_column_2", 
      "fieldType": "string"
    },
    {
      "fieldName": "row_1_column_2",  
      "fieldType": "string"
    }, 
    {
      "fieldName": "row_2_column_2", 
      "fieldType": "string"
    },
    {
      "fieldName": "winner", 
      "fieldType": "string"
    }
      
  ], 
  "streamDef": {
    "info": "data.csv", 
    "version": 1, 
    "streams": [
      {
        "info": "data.csv", 
        "source": "file://data.csv", 
        "columns": [
          "*"
        ], 
        "last_record": -1
      }
    ]
  },
  "inferenceType": "MultiStep",
  "inferenceArgs": {
    "predictionSteps": [
      1
    ], 
    "predictedField": "winner"
  }, 
  "iterationCount": -1, 
  "swarmSize": "small"
}

The first few rows of data.csv - the actual data is another 500 lines

    reset,row_0_column_0,row_0_column_1,row_0_column_2,row_1_column_0,row_1_column_1,row_1_column_2,row_2_column_0,row_2_column_1,row_2_column_2,winner
    bool,string,string,string,string,string,string,string,string,string,string
    R,C,C,C,C,C,C,C,C,C,C
    1,2,0,1,0,1,0,1,2,0,0
    0,0,0,0,0,0,0,0,0,0,1
    1,2,2,2,1,1,0,0,1,0,0
    0,0,0,0,0,0,0,0,0,0,2

I was working with this more today, so I wanted to post a quick update. I noticed that if I move the reset column to the end of the row, I don’t hit the assert. I didn’t have time to dig into the code to figure out why I was hitting the assert in the first place; I was just playing around with my input format for fun.

    row_0_column_0,row_0_column_1,row_0_column_2,row_1_column_0,row_1_column_1,row_1_column_2,row_2_column_0,row_2_column_1,row_2_column_2,winner,reset
    string,string,string,string,string,string,string,string,string,string,int
    C,C,C,C,C,C,C,C,C,C,R
    1,2,1,2,1,0,2,0,1,0,1
    0,0,0,0,0,0,0,0,0,1,0
    2,1,0,2,0,1,2,0,1,0,1
    0,0,0,0,0,0,0,0,0,2,0
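
One thing worth noting about the two headers in this thread: they differ in more than column order. The first declares reset as bool, while the second declares it as int, which matches the "fieldType": "int" entry in test_search.json. Whether the type change, rather than the column move, is what avoids the assert is not established here, although the traceback does end on a line mentioning FieldMetaType.integer. As a rough sketch for spotting this kind of mismatch, the following standalone script compares the type row of a CSV header against the includedFields of a search definition. The function name header_type_mismatches is hypothetical, search_def is an abbreviated stand-in for test_search.json, and nothing here calls NuPIC itself.

```python
import csv


def header_type_mismatches(csv_text, search_def):
    """Return (field, json_type, csv_type) for every field whose declared
    type differs between the search definition and the CSV header."""
    # Strip each line so indented samples parse the same as flush-left ones.
    lines = [line.strip() for line in csv_text.strip().splitlines()]
    reader = csv.reader(lines)
    names = next(reader)   # row 1: field names
    types = next(reader)   # row 2: field types
    csv_types = dict(zip(names, types))

    mismatches = []
    for field in search_def["includedFields"]:
        name = field["fieldName"]
        csv_type = csv_types.get(name)  # None if the column is missing
        if csv_type != field["fieldType"]:
            mismatches.append((name, field["fieldType"], csv_type))
    return mismatches


# Board columns in the order used by the CSV headers in this thread.
board = ["row_%d_column_%d" % (r, c) for r in range(3) for c in range(3)]

# Abbreviated stand-in for test_search.json: only includedFields is needed.
search_def = {
    "includedFields":
        [{"fieldName": "reset", "fieldType": "int"}] +
        [{"fieldName": n, "fieldType": "string"} for n in board + ["winner"]]
}

# Header rows of the first (failing) and second (working) data.csv.
first_header = ("reset," + ",".join(board) + ",winner\n" +
                "bool" + ",string" * 10 + "\n" +
                "R" + ",C" * 10)
second_header = (",".join(board) + ",winner,reset\n" +
                 "string," * 10 + "int\n" +
                 "C," * 10 + "R")

print(header_type_mismatches(first_header, search_def))
print(header_type_mismatches(second_header, search_def))
```

Run against the two headers above, the first reports reset as declared int in the .json but bool in the CSV, and the second reports no mismatches.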

I still think you should write a TicTacToe Encoder and forget about swarming. We have some good docs on encoders.

I agree, and I thought I was moving in that direction by changing my swarm to use category encoders rather than scalar encoders as a first step. Are you thinking that I should abandon the idea of swarming altogether, or would I come back to it once my encoder is complete?

Thanks for helping me understand this stuff, Matt :)

Yes, if I were you I would abandon the idea of swarming and build your own encoder. You want something that takes a row of input data (in whatever format you want) and returns a binary array. The array does not have to be sparse, but be sure to read this paper:

You can also see examples of existing NuPIC encoders under nupic/src/nupic/encoders on the master branch of the numenta/nupic repository on GitHub. The DateEncoder is a good example of an encoder that joins multiple data input points into one representation.

Once you have an encoder, I would use the anomaly model params for the SP/TM and your own custom-tuned encoder params for the new encoder. You might still be able to swarm to get the best params for your new encoder, but since you designed the encoder yourself, you probably have the best idea of how to tune it. That’s why I would not suggest swarming if you’re creating your own encoder.
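
To make the "row of input data in, binary array out" idea concrete, here is a minimal sketch of what such an encoder could look like. It is not a NuPIC encoder: it does not subclass anything from nupic.encoders, and the names encode_board, CELL_VALUES, and bits_per_value are made up for illustration. It also assumes that the cell values 0, 1, and 2 in data.csv mean empty and the two players, which the thread does not state. Each cell gets its own block of bits, and within that block a run of bits is switched on for the cell's value.

```python
# Assumed meaning of the cell values in data.csv: 0 = empty, 1 and 2 = players.
CELL_VALUES = (0, 1, 2)


def encode_board(cells, bits_per_value=3):
    """Encode 9 cell values as a flat list of 0/1 bits.

    Each cell owns a block of len(CELL_VALUES) * bits_per_value bits, and
    exactly bits_per_value of them are set, chosen by the cell's value.
    """
    if len(cells) != 9:
        raise ValueError("expected 9 cells, got %d" % len(cells))

    block = len(CELL_VALUES) * bits_per_value
    encoding = [0] * (9 * block)
    for position, value in enumerate(cells):
        # Raises ValueError if the value is not one of CELL_VALUES.
        index = CELL_VALUES.index(int(value))
        start = position * block + index * bits_per_value
        for offset in range(bits_per_value):
            encoding[start + offset] = 1
    return encoding


# Board columns from the first data row of the original data.csv
# (the leading reset flag and trailing winner column are left out).
row = "2,0,1,0,1,0,1,2,0".split(",")
bits = encode_board(row)
print(len(bits), sum(bits))  # 81 bits in total, 27 of them set
```

With these settings a third of the bits are on, so the output is not sparse. Widening each block or lowering bits_per_value changes that ratio, which is the kind of hand tuning described above.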