Version Info
I am using NUPIC version 0.5.7 with nupic.core version 0.4.13.
Overview:
I am trying to get NUPIC to identify the winner of tic-tac-toe boards that I pass to it. However, NUPIC does not seem to be able to converge on the rules for tic-tac-toe. I need help figuring out a way to debug this.
Details:
My plan is to send NUPIC data in pairs of rows, where the first row is the serialization of a game board and the second indicates the player that won the game. I was hoping that NUPIC could learn the basic rules of tic-tac-toe this way, but that has not been the case. Everything seems to be working, but NUPIC never seems to converge, and its guesses about which player wins a board are pretty much random. They do not get any better over time.
A board can be serialized into a single row of values: a ‘reset’ column, a column for each of the nine positions on the tic-tac-toe board, and finally a ‘winner’ column that indicates the winner of the board in the previous row. A board and its identification can then be expressed as a pair of rows. The first row has the reset bit set in the first column, the values of the board in the following nine columns, and zero in the winner column. The second row is all zeros except for the winner column, which indicates the winner: zero, player 1, or player 2.
My experiment begins by creating a few hundred sample boards. These boards and their results are written to a .csv file, and I swarm over that file to produce a model. I then feed the model more randomly generated boards and their results. I was hoping that after a certain amount of time, given a board, my model would always predict the correct value of the winner column in the row that comes next (the result row). However, what I am seeing seems more or less random. The model almost always predicts player 1, though not always, and it will sometimes make a correct assessment of a board where player 2 wins. But when I track results over time, the accuracy does not improve and the guesses appear to be just about random.
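For reference, the row pair described above can be sketched as follows. This is only an illustration: the field names are taken from the model output further down, but the column order and the helper name board_to_rows are assumptions, not the actual generator code.

```python
# Field names match the keys seen in rawInput; the column ORDER is an assumption.
FIELDS = (["reset"]
          + ["row_%d_column_%d" % (r, c) for r in range(3) for c in range(3)]
          + ["winner"])


def board_to_rows(board, winner):
    """Return the (board row, result row) pair for one labelled board.

    board  -- 3x3 list of lists holding 0 (empty), 1 or 2
    winner -- 0, 1 or 2
    """
    # First row: reset bit set, nine cell values, winner column left at zero.
    board_row = {"reset": 1, "winner": 0}
    for r in range(3):
        for c in range(3):
            board_row["row_%d_column_%d" % (r, c)] = board[r][c]

    # Second row: all zeros except the winner column.
    result_row = dict.fromkeys(FIELDS, 0)
    result_row["winner"] = winner

    return ([board_row[f] for f in FIELDS],
            [result_row[f] for f in FIELDS])


first, second = board_to_rows([[1, 1, 2], [1, 2, 2], [2, 1, 0]], 2)
```

Writing the pair out like this makes one property of the scheme easy to see: the result row is entirely zero apart from the winner column, so a zero in a board column means "empty cell" in the first row and "no data" in the second.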
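One way to make "just about random" measurable is to compare a rolling accuracy against the accuracy of always guessing the most common winner. The sketch below is a hypothetical helper (AccuracyTracker is not part of NUPIC) and assumes the predicted and actual winner values are recorded once per board.

```python
from collections import Counter, deque


class AccuracyTracker(object):
    """Rolling accuracy plus a majority-class baseline (hypothetical helper)."""

    def __init__(self, window=100):
        self.recent = deque(maxlen=window)   # True/False for the last N boards
        self.label_counts = Counter()        # how often each winner occurred

    def record(self, predicted, actual):
        self.recent.append(predicted == actual)
        self.label_counts[actual] += 1

    def rolling_accuracy(self):
        if not self.recent:
            return 0.0
        return sum(self.recent) / float(len(self.recent))

    def majority_baseline(self):
        # Accuracy obtained by always guessing the most frequent winner so far.
        total = sum(self.label_counts.values())
        if not total:
            return 0.0
        return max(self.label_counts.values()) / float(total)


tracker = AccuracyTracker(window=4)
for predicted, actual in [(1, 1), (1, 2), (1, 1), (1, 1)]:
    tracker.record(predicted, actual)
```

If the rolling accuracy stays close to the majority baseline, the model may simply have learned that one winner is more common in the generated boards, which would be consistent with it almost always predicting player 1.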
I believe that learning is occurring, because when I explicitly disable it there is no intelligence at all: the model just predicts zero, which is the default value, never player 1 or player 2.
Request for Help
I’ve tried altering parameters somewhat, but it feels like I’m just guessing: changing the code and hoping that I get a better result. I’m not really sure how to debug this, or why my model is making the guesses that it is. Does anyone here know of a good way to figure out what is going on behind the scenes? I’d like to understand why it will sometimes guess player 2 and get it right. I’d also like to figure out what I am screwing up in either my training data or the way I’m creating the model, as it seems that this problem should be very simple for NUPIC to figure out.
Sample Data:
This is some of the data that I am getting from my experiments. I’m hoping that someone can take a look and see something that I am not, in terms of things that are obviously wrong, and help me identify a good direction for continuing to investigate this issue.
Here is one of the boards my code generated. As you can see, it should be a player 2 win (the diagonal from top right to bottom left is all 2s).
[1, 1, 2]
[1, 2, 2]
[2, 1, 0]
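The expected label for this board can be checked independently of the model. The following is a minimal sketch of a winner check, not the labelling code actually used in the experiment; the function name find_winner is an assumption.

```python
# The eight winning lines as (row, column) triples.
LINES = (
    [[(r, 0), (r, 1), (r, 2)] for r in range(3)]      # rows
    + [[(0, c), (1, c), (2, c)] for c in range(3)]    # columns
    + [[(0, 0), (1, 1), (2, 2)],                      # main diagonal
       [(0, 2), (1, 1), (2, 0)]]                      # anti-diagonal
)


def find_winner(board):
    """Return 1 or 2 if that player owns a full line, otherwise 0.

    Note: a randomly filled board can contain a full line for BOTH players;
    this sketch returns whichever line it meets first, so such boards would
    carry an arbitrary label.
    """
    for line in LINES:
        a, b, c = [board[r][col] for r, col in line]
        if a != 0 and a == b == c:
            return a
    return 0


board = [[1, 1, 2],
         [1, 2, 2],
         [2, 1, 0]]
label = find_winner(board)
```

For the board above this returns 2, via the anti-diagonal.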
Here is the model result from when I serialized this board and fed it into my model:
ModelResult( predictionNumber=142
rawInput={'reset': 1.0, 'row_2_column_2': 0.0, 'row_2_column_1': 1.0, 'row_2_column_0': 2.0, 'row_0_column_1': 1.0, 'row_0_column_0': 1.0, 'row_0_column_2': 2.0, 'winner': 0.0, 'row_1_column_2': 2.0, 'row_1_column_0': 1.0, 'row_1_column_1': 2.0}
sensorInput=SensorInput( dataRow=(0.0,)
dataDict={'reset': 1.0, 'row_2_column_2': 0.0, 'row_2_column_1': 1.0, 'row_2_column_0': 2.0, 'row_0_column_1': 1.0, 'row_0_column_0': 1.0, 'row_0_column_2': 2.0, 'winner': 0.0, 'row_1_column_2': 2.0, 'row_1_column_0': 1.0, 'row_1_column_1': 2.0}
dataEncodings=[array([ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)]
sequenceReset=0.0
category=-1
)
inferences={'multiStepPredictions': {1: {0.0: 0.012589512422763568, 1.0: 0.91143423257015377, 2.0: 0.075976255007092777}}, 'multiStepBucketLikelihoods': {1: {0: 0.012589512422763568, 254: 0.075976255007092777, 127: 0.91143423257015377}}, 'multiStepBestPredictions': {1: 1.0}, 'anomalyScore': None}
metrics=None
predictedFieldIdx=0
predictedFieldName=winner
classifierInput=ClassifierInput( dataRow=0.0
bucketIndex=0
)
)
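The part of this output that matters for tracking results is the inferences dictionary. A small sketch of pulling the ranked likelihoods out of it is shown below; the dictionary is copied from the result above, and summarize_prediction is a hypothetical helper that works on the plain dict rather than on any NUPIC object.

```python
def summarize_prediction(inferences, steps=1):
    """Return (best value, [(value, likelihood), ...] sorted high to low)."""
    likelihoods = inferences["multiStepPredictions"][steps]
    best = inferences["multiStepBestPredictions"][steps]
    ranked = sorted(likelihoods.items(), key=lambda kv: kv[1], reverse=True)
    return best, ranked


# Copied from the ModelResult for predictionNumber=142.
inferences = {
    "multiStepPredictions": {1: {0.0: 0.012589512422763568,
                                 1.0: 0.91143423257015377,
                                 2.0: 0.075976255007092777}},
    "multiStepBestPredictions": {1: 1.0},
    "anomalyScore": None,
}

best, ranked = summarize_prediction(inferences)
for value, likelihood in ranked:
    print("winner=%s likelihood=%.3f" % (value, likelihood))
```

Logging the full ranking for every board, rather than only the best prediction, shows whether the likelihood assigned to the correct winner moves at all as more boards are fed in.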
This is the model result for the row that I fed into NUPIC after it was 91% confident that player 1 should have won. You can see that I set the winner value to 2.0.
ModelResult( predictionNumber=143
rawInput={'reset': 0.0, 'row_2_column_2': 0.0, 'row_2_column_1': 0.0, 'row_2_column_0': 0.0, 'row_0_column_1': 0.0, 'row_0_column_0': 0.0, 'row_0_column_2': 0.0, 'winner': 2.0, 'row_1_column_2': 0.0, 'row_1_column_0': 0.0, 'row_1_column_1': 0.0}
sensorInput=SensorInput( dataRow=(2.0,)
dataDict={'reset': 0.0, 'row_2_column_2': 0.0, 'row_2_column_1': 0.0, 'row_2_column_0': 0.0, 'row_0_column_1': 0.0, 'row_0_column_0': 0.0, 'row_0_column_2': 0.0, 'winner': 2.0, 'row_1_column_2': 0.0, 'row_1_column_0': 0.0, 'row_1_column_1': 0.0}
dataEncodings=[array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], dtype=float32)]
sequenceReset=0.0
category=-1
)
inferences={'multiStepPredictions': {1: {1.0: 0.0039215686274509803, 2.0: 0.99607843137254592}}, 'multiStepBucketLikelihoods': {1: {254: 0.99607843137254592, 127: 0.0039215686274509803}}, 'multiStepBestPredictions': {1: 2.0}, 'anomalyScore': None}
metrics=None
predictedFieldIdx=0
predictedFieldName=winner
classifierInput=ClassifierInput( dataRow=2.0
bucketIndex=254
)
)