Self-play has been a big success for AlphaZero, which exceeded human-level play within a few days. Is there a role for self-play in HTM? Can an HTM be trained to play a game with defined rules and then be set to train by playing itself?
HTM is currently missing a reinforcement learning capability, so by itself it would not be able to formulate or execute game strategies. In its current state, HTM could learn game states and make predictions about what will happen in the future (those predictions could include predicted reward). You’d need to combine HTM with other AI technologies to create a game-playing system.
HTM models aren’t really trained in the traditional supervised sense. They simply observe input sequences and output predictions about those sequences. The parameter adjustments that do occur are based on the input sequence itself, not on derived error values.
BTW, if you haven’t seen it yet, you might want to check out this project which explores HTM + RL for a game agent. It could be a good start if you are hoping to continue this type of research. You can also join our “Dad’s song” working group, which has some similar goals, if you want to bounce ideas and brainstorm.
This is a great question, but it exposes some confusion about what HTM is and is not.
HTM is a theory of spatiotemporal memory storage, which includes learning, inference, and prediction.
Learning is the Hebbian part, happening in the synapses between neurons. It’s the whole “neurons that fire together wire together” thing. Patterns reinforce themselves over time as they are continually perceived. Learning also involves inherent properties of SDRs. The brain takes advantage of those properties to learn better and faster.
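To make the Hebbian idea concrete, here is a minimal sketch of a permanence-style synapse update for one active neuron. It is an illustration only, not Numenta’s implementation: the function name `hebbian_update` and the `increment` and `decrement` values are assumptions chosen for the example.

```python
def hebbian_update(permanences, active_inputs, increment=0.05, decrement=0.02):
    """Return updated synapse permanences for one neuron that just fired.

    permanences:   dict mapping input index -> permanence in [0, 1]
    active_inputs: set of input indices that were active at the same time
    increment / decrement are illustrative values, not tuned parameters.
    """
    updated = {}
    for index, permanence in permanences.items():
        if index in active_inputs:
            # Input fired together with the neuron: strengthen the synapse.
            permanence = min(1.0, permanence + increment)
        else:
            # Input was silent while the neuron fired: weaken the synapse.
            permanence = max(0.0, permanence - decrement)
        updated[index] = permanence
    return updated


# The same pattern (inputs 0 and 1) is perceived three times in a row.
synapses = {0: 0.20, 1: 0.20, 2: 0.20}
for _ in range(3):
    synapses = hebbian_update(synapses, active_inputs={0, 1})
```

Note that no error value appears anywhere: the update depends only on which inputs were active, so repeated patterns reinforce themselves and unused synapses decay.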
Inference involves representation, because if you have a representation of something you perceive in reality, you have to be able to compare it to all the things you’ve perceived in the past to infer what that thing actually is and how it relates to every other object you’ve learned.
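One way to picture that comparison is to treat each SDR as the set of its active bit indices and measure overlap. The sketch below is hypothetical: the labels, the tiny SDRs, and the `min_overlap` threshold are made up for illustration, and real SDRs are far larger and sparser.

```python
def overlap(sdr_a, sdr_b):
    """Number of active bits two SDRs share (each SDR is a set of indices)."""
    return len(sdr_a & sdr_b)


def best_match(query, memory, min_overlap=3):
    """Return the label of the stored SDR that overlaps most with the query.

    Returns None when no stored SDR shares at least min_overlap bits,
    meaning the input is not recognized as anything seen before.
    """
    best_label, best_score = None, 0
    for label, stored in memory.items():
        score = overlap(query, stored)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= min_overlap else None


# Hypothetical stored representations.
memory = {
    "cat": {1, 5, 9, 14, 20},
    "dog": {2, 5, 11, 14, 30},
}

noisy_cat = {1, 5, 9, 14, 33}      # one bit differs from the stored "cat"
unknown = {40, 41, 42, 43, 44}     # shares no bits with anything stored

recognized = best_match(noisy_cat, memory)
unrecognized = best_match(unknown, memory)
```

Because matching is based on shared bits rather than exact equality, a slightly noisy input still resolves to the stored pattern it most resembles.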
Prediction means knowing how objects in the world move, what causes have what effects, how objects interact over time. Prediction requires temporal sequence memory over spatial data. We think it also requires sparsity.
Now, when you think about RL, you think about some signal fed into the system that carries a score or other feedback, which the system uses to tune its actions over time. This could exist outside the current realm of HTM theory.
rhyo, thanks for the response. There is a ton of stuff packed into your post. I get the learning part. I sort of get the inference part, but I am a bit short on how associated labels (names) are learned. As for the last point, prediction over time, I do not see that in HTM or in anyone else’s systems.
I was reading this piece on AI in games and they describe a state machine that is fairly orthodox.
My thought is that the HTM block could detect when play is deviating from some “normal” state and trigger a state change evaluation.
Along these same lines - every part of the block diagrams could be evaluated for anomaly detection as a possible case for HTM as a solution.
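The idea of an anomaly signal triggering a state-change evaluation could be sketched roughly as follows. This is not HTM: a simple first-order transition memory stands in for temporal memory, and the class name `TransitionMemory`, the event names, the `window`, and the `threshold` are all assumptions made for the example.

```python
from collections import defaultdict


class TransitionMemory:
    """Toy stand-in for sequence memory: remembers which event followed which."""

    def __init__(self):
        self.seen = defaultdict(set)   # previous event -> events seen after it
        self.previous = None

    def observe(self, event):
        """Learn the transition; return 1.0 if it was unexpected, else 0.0."""
        if self.previous is None:
            score = 0.0                # nothing to predict from yet
        else:
            score = 0.0 if event in self.seen[self.previous] else 1.0
            self.seen[self.previous].add(event)
        self.previous = event
        return score


def needs_state_evaluation(scores, window=3, threshold=0.5):
    """True when the mean anomaly score over the last `window` events is high."""
    recent = scores[-window:]
    if not recent:
        return False
    return sum(recent) / len(recent) > threshold


memory = TransitionMemory()

# "Normal" play: a repeating, predictable pattern of hypothetical game events.
scores = [memory.observe(e) for e in ["patrol", "scan"] * 10]
calm = needs_state_evaluation(scores)

# Play deviates: events arrive in an order never seen before.
scores += [memory.observe(e) for e in ["hit", "flee", "hit"]]
alarmed = needs_state_evaluation(scores)
```

In a game agent, the state machine would keep its own transition rules; the anomaly flag would only tell it when the current state is worth re-evaluating.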
Yes, we are in the realm of active theory here. But much of what we’ve learned about temporal sequence memory prediction applies, so we have a lot to work with.
How about card games that require being good at predicting what might come next, like blackjack and poker?
Um, once you subtract the cards that have already been seen, predicting the next card is just a random selection from the cards left in the deck.
There is no valid method of predicting a well-generated random number.
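A small sketch of that point, assuming a standard 52-card deck, ranks only (suits ignored), and a perfectly random shuffle. The function names are made up for the example. Tracking seen cards changes the probabilities, but the draw itself stays random.

```python
from collections import Counter
from fractions import Fraction

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]


def remaining_counts(seen, decks=1):
    """Count how many cards of each rank are left after removing seen cards."""
    counts = Counter({rank: 4 * decks for rank in RANKS})
    counts.subtract(seen)
    if any(n < 0 for n in counts.values()):
        raise ValueError("seen cards are not consistent with the deck")
    return counts


def next_card_probabilities(seen, decks=1):
    """Probability of each rank being drawn next from a well-shuffled deck."""
    counts = remaining_counts(seen, decks)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no cards left to draw")
    return {rank: Fraction(n, total) for rank, n in counts.items()}


# Two aces and a king have been seen; 49 cards remain.
probabilities = next_card_probabilities(["A", "A", "K"])
```

With those three cards gone, an ace comes next with probability 2/49 and a king with 3/49, while every untouched rank sits at 4/49. That is the whole of what can be known if the shuffle is truly random.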
Keeping track of which cards are left, and of how the order changes from one shuffle to the next (depending on the dealer’s less-than-random methodology), puts the odds in the player’s favor.
Here is an applicable example:
So you are proposing a tool to sell to casino owners to spot anomalies?
I suppose that you could tie it to the table cameras and spot things automatically.
I was thinking more like a robot add-in, to use at casinos that have no rules against allowing bots to play.
Since I find card games and such boring, I have no interest in gambling. But a robot that is such a good player that it has to be banned from ever playing again sounds like a lot of fun!