Playing Rock, Paper, Scissors with HTM, Take 2

This is my second attempt at making HTM play rock paper scissors against a neural network (link to my last attempt). This time I’ll up the game from tiny-dnn vs NuPIC.core to PyTorch vs Etaler.

The project

As in my last attempt, the project is to have a neural network play rock paper scissors against HTM and see who wins. If both algorithms are good enough at predicting their opponent’s move, both networks should fall into a Nash equilibrium and have the same winning rate. (Note: this does not mean both networks will generate random predictions, so the draw rate may or may not be 33%.)
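
To make the equilibrium idea concrete, here is a small sketch that is not part of the experiment itself. It computes the expected payoff of one mixed strategy against another. The function names and the example strategies are made up for illustration. It shows that nobody gains an edge against uniform play, while a player with a bias can be exploited.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
# BEATS[a] is the move that a defeats.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(a, b):
    """+1 if move a beats move b, -1 if it loses, 0 on a draw."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1


def expected_payoff(p, q):
    """Expected payoff for a player using mixed strategy p against q.

    p and q map each move to its probability.
    """
    return sum(p[a] * q[b] * payoff(a, b) for a in MOVES for b in MOVES)


uniform = {m: Fraction(1, 3) for m in MOVES}
# Hypothetical biased player that favours rock.
biased = {"rock": Fraction(1, 2), "paper": Fraction(1, 4), "scissors": Fraction(1, 4)}
always_paper = {"rock": Fraction(0), "paper": Fraction(1), "scissors": Fraction(0)}

print(expected_payoff(always_paper, uniform))  # 0: no edge against uniform play
print(expected_payoff(always_paper, biased))   # 1/4: the bias can be exploited
```

Exact fractions are used instead of floats so the zero result is not blurred by rounding.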

In the experiment, the neural network model is a simple LSTM network followed by a fully connected layer, trained with BCELoss and SGD. The HTM side has a grid cell encoder, a Temporal Memory and a CLAClassifier to decode the prediction.

The NN Model

self.rnn1 = nn.LSTM(3, self.hidden_size, self.num_lstm_stack)
self.fc1 = nn.Linear(self.hidden_size, 3)

And the HTM model

self.tm1 = et.TemporalMemory(input_shape, cells_per_column)
# SDRClassifer in Etaler is the equivalent of CLAClassifier in NuPIC
self.sc1 = et.SDRClassifer(input_shape, 3)

Then both networks play against each other, learning the opponent’s move each step. We record the outcomes and analyze them.
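
The loop can be sketched roughly as below. This is not the actual experiment code: the two predictor classes are deliberately trivial stand-ins for the LSTM and the HTM side, and every name here is invented for the example. Only the overall shape (predict, play the counter, learn online from the opponent’s real move, record the outcome) follows the description above.

```python
from collections import Counter

MOVES = ("rock", "paper", "scissors")
# For each move, the move that beats it.
COUNTER = {"rock": "paper", "paper": "scissors", "scissors": "rock"}


class FrequencyPredictor:
    """Stand-in predictor: guesses the opponent's most frequent move so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self):
        if not self.counts:
            return MOVES[0]
        return self.counts.most_common(1)[0][0]

    def learn(self, opponent_move):
        self.counts[opponent_move] += 1


class LastMovePredictor:
    """Stand-in predictor: assumes the opponent repeats its last move."""

    def __init__(self):
        self.last = MOVES[0]

    def predict(self):
        return self.last

    def learn(self, opponent_move):
        self.last = opponent_move


def play(agent_a, agent_b, rounds):
    """Play `rounds` games and return counts of 'a', 'b' and 'draw' outcomes."""
    results = Counter()
    for _ in range(rounds):
        # Each side predicts the opponent's move and plays its counter.
        move_a = COUNTER[agent_a.predict()]
        move_b = COUNTER[agent_b.predict()]
        if move_a == move_b:
            results["draw"] += 1
        elif COUNTER[move_b] == move_a:
            results["a"] += 1
        else:
            results["b"] += 1
        # Both sides learn online from the move the opponent actually played.
        agent_a.learn(move_b)
        agent_b.learn(move_a)
    return results


results = play(FrequencyPredictor(), LastMovePredictor(), rounds=1000)
print(dict(results))
```

Swapping in a real model only requires an object with the same `predict` and `learn` methods.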

Results

I don’t know why the result is so different from my last attempt. But HTM totally beats LSTM.
Figure_1

Even after 10x more games, LSTM still can’t catch up with HTM.

I guess this is where my side project ends. Maybe someone who knows more about how LSTM works can explain why it is performing so poorly? Maybe we can modify the LSTM’s parameters so it works as well as HTM? Source code available here.

Note: How did I interface Etaler in Python without a Python wrapper?

ROOT, a data analysis framework developed by CERN, supports parsing C++ headers and exposing them in Python automatically. The process is far from perfect, but it works well enough for a side project.

This is interesting. From the charts I can only guess that LSTM would require a few million rounds to get even at predicting what its opponent will do.

Did you count their actual moves (besides who won), to see whether some glitch makes either side play one move more or less often than 33% of the time?
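
A check like that could look roughly like the sketch below. The history here is made up, and the helper names are just for illustration. In the real experiment the list would be the logged moves of one player.

```python
from collections import Counter

MOVES = ("rock", "paper", "scissors")


def move_shares(history):
    """Return the fraction of games in which each move was played."""
    counts = Counter(history)
    total = len(history)
    return {move: counts[move] / total for move in MOVES}


def chi_square_vs_uniform(history):
    """Pearson chi-square statistic against a uniform 1/3 split (2 degrees of freedom)."""
    counts = Counter(history)
    expected = len(history) / len(MOVES)
    return sum((counts[move] - expected) ** 2 / expected for move in MOVES)


# Made-up history for illustration only.
history = ["rock"] * 50 + ["paper"] * 30 + ["scissors"] * 20
print(move_shares(history))
print(chi_square_vs_uniform(history))
```

With 2 degrees of freedom, a statistic above roughly 5.99 would suggest a non-uniform split at the 5% level. Moves in a learned sequence are not independent, so this should be read as a rough sanity check rather than a strict test.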

When in that 200K sequence did the TM get some reinitialization steps? Or was it, to the contrary, run fully in one go? If so, I don’t really understand this result.

No, the TM is trained on the fly and never reinitialized. I’m wondering whether it is caused by Etaler using a different synapse-growing algorithm compared to NuPIC.

Yes, this seems likely. Have you run a competition between NuPIC.core and Etaler?

Another interesting match would be PyTorch vs tiny-dnn.

I’ll tackle it when I have more free time :slight_smile:

Unfortunately that’s very hard. tiny-dnn is way too big for ROOT to JIT the entire library. It takes like 3GB of RAM and a few minutes just to load tiny-dnn into ROOT.

I’ve been thinking about this, and it’s certainly not obvious why the different synaptic growing algorithm would have such a major advantage when used for this particular task. I figured I would share some thoughts I had which might be relevant. Of course the other likely possibility could be some problem with the PyTorch configuration.

The first obvious thing to point out is that none of these algorithms are playing the game in quite the same way as most humans would. They cannot rely on psychology or strategy… they are simply trying to recognize and predict a sequence of moves that they have previously seen the other side perform. Their output is hard-coded to take each predicted next move and perform its counter move, thus forming a new sequence that the other side must learn.

With this in mind, I would have predicted that HTM should have a clear advantage in this type of match, since it is capable of learning new sequences much faster than most classic ML algorithms. The classic algorithm would need to run through the same sequence a greater number of times to learn it, during which HTM would be correctly predicting many of its moves.

Strangely, that is not what we saw in the first round though (tiny-dnn VS NuPIC.core). NuPIC had only a slight advantage over tiny-dnn. Why were the results in this round so different? I’m thinking it is related to the disambiguation process, and some advantages that Etaler has over the classic TM algorithm in this area.

Etaler grows a single distal segment per cell, and that one segment always connects up to “max synapse per cell” in each learning step. In the classic TM algorithm, this is roughly equivalent to setting “max segments per cell” = 1 and “max new synapse count” = “max synapses per segment”. I suspect that this will result in a reduction in the number of different contexts that a given cell can be a part of (compared to the classic TM algorithm). Instead, a cell in a given context will more readily mutate into a new context, forgetting its previous contexts.

A distal segment in Etaler will also grow synapses with any of the previous active cells (not just previous winner cells). And Etaler doesn’t have an equivalent logic from classic TM where if “max new synapse count” synapses are active, it will not grow new synapses even if the count is less than “max synapses per segment”. These differences should also contribute to a cell more easily mutating into a new context.

The result may be that Etaler does not remember as many old contexts. Thus, when it encounters point errors as PyTorch is adjusting its model, it requires fewer inputs to disambiguate compared to NuPIC. Perhaps as a result, it may be able to begin learning the new pattern that much sooner.

Did you consider the possibility that the classical NN is simply defeating itself, as if trying to predict against an RNG?
I’m far from understanding the specifics of Etaler, and can’t beat Paul at such analysis, but (intuitively at least) nothing in vanilla HTM seems to me like a good fit for such a game.

Now, I’d bet any RNN, to the contrary, could become quite good against basic human psychology.
As a side note, I’d have put both sides’ plays into the input stream.

That will be the Nash equilibrium of such a game. If both NN and HTM are sufficiently powerful, they should both end up being an RNG and no one can predict the opponent’s next move any better than guessing.

To me, HTM seems ideally fitted for this type of scenario. HTM is playing the game simply by recognizing sequences of moves that it has seen before. This should give it at least some advantage over another algorithm which is slower at learning new sequences. The slower-learning algorithm has to see a pattern more frequently to learn it than HTM does, and so while it is going through those learning cycles, HTM will learn and correctly predict more of its moves than if it were a simple RNG. HTM doesn’t have to guess every move correctly, just more frequently than it would by random chance. Every time it is wrong about what sequence it is seeing, minicolumns burst and it then disambiguates to another matching sequence, or starts learning a new one.

Yes, I think this is pretty much the case for PyTorch vs Etaler. But more interesting is the fact that, while the RNN is “defeating itself”, the HTM system is doing spectacularly well at predicting the RNN’s moves (i.e. the RNN is more predictable than an RNG). HTM is clearly better suited for predicting how an RNN will model it than an RNN is at predicting how HTM will model it.

It is also important to note that these experiments do not indicate RNN is simply unable to play RPS. RNN vs RNN for example shows a clear drop in the draw rate:

One of the other experiments from the previous round (tiny-dnn VS NuPIC.core) I think also supports the idea that HTM is just learning and updating its model faster than the RNN can keep up with. As soon as HTM learning was turned off, the RNN had a chance to accurately model it. Prior to that, it was just too fast of a moving target.

Why would it try to win? What would be the motivation to pick one move over another? Picking a winning move is just as likely as picking a losing or draw move.
If there was a way to cull the losing sequences … well … this is not something that HTM can currently do.
I suppose you could feed it all possible winning and draw moves as a training set with learning on, then turn learning off.

It isn’t trying to win. It is trying to predict what the other side is going to do. The “try to win” part is hard coded. Whatever it predicts the opponent will do next is fed into a simple bit of code that generates a counter move. The system is not modeling itself. It is modeling the opponent.
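
That “simple bit of code” could be as small as the sketch below. The score layout (one value per move, in a fixed order) is my assumption for illustration and not taken from the actual source; any classifier or network output of that shape would do.

```python
MOVES = ("rock", "paper", "scissors")
# For each move, the move that beats it.
BEATEN_BY = {"rock": "paper", "paper": "scissors", "scissors": "rock"}


def counter_move(prediction_scores):
    """Pick the move that beats the opponent's most likely next move.

    prediction_scores holds one score per entry of MOVES, e.g. classifier
    probabilities or network outputs (assumed layout, for illustration).
    """
    best_index = max(range(len(MOVES)), key=lambda i: prediction_scores[i])
    return BEATEN_BY[MOVES[best_index]]


print(counter_move([0.1, 0.7, 0.2]))  # paper is most likely, so play scissors
```

Nothing in this step is learned. All of the modeling happens in whatever produces the scores.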

My bad, I was thinking about tic-tac-toe and lost the rock-paper-scissors thrust of the thread.

This thread assumes that there is some internal decision model to be learned if some strategy is being employed, and that this is exhibited as a recognizable sequence. Learning to pick random numbers would be as effective a choice as any strategy and would be very hard for HTM to predict.

True, an added RL algorithm that models state/action/result would be able to intentionally increase the draw rate whenever it could no longer predict the HTM moves.

Holy. Paul you are really good at writing. It’s a joy reading your posts. :smiling_face_with_three_hearts:

Ohh… I didn’t know that. I should read the white paper and BAMI more closely.

For anyone who doesn’t know, this is how Etaler solves the repeating input problem.

Oops, sorry that was a bad typo (I’ll fix it). This should have been “max new synapse count”, not “activation threshold”. Etaler skips this because it doesn’t distinguish “max new synapse count” from “max synapses per segment” (it has instead a “max synapses per cell”)

@marty1885 I am very interested in comparing the multiple-timestep prediction performance of HTM vs a tiny-dnn LSTM on simple sine wave data, or any combination of sine waves at different frequencies.
Do you have any experiments like that?

No, but I could make one. :smiley: Any details that you can share? (Maybe in a different/new topic?)

Tho I’ll predict HTM will outperform tiny-dnn. Neural networks are not good at continuous learning; they tend to forget stuff doing so.

@marty1885 if you give me your email we can discuss and share what you need. In my experiments with HTM.core for very simple sine wave prediction I found:

  1. With a chain TM->TM->TM…, we can only predict at most 3 steps into the future.
  2. The current SDRClassifier does not work well for prediction. I am working on improvements now.