A TM at time t is simply trained to predict its input at time t+1.
From an ML perspective, this is very restrictive.
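As a rough sketch of that regime (the class and method names below are illustrative, not any particular library’s API), the only teaching signal is the next input itself:

```python
from collections import defaultdict

class ToyTemporalMemory:
    """Toy stand-in for a TM: it only learns which input tends to follow which.

    A sketch of the 'predict your own next input' regime, not an
    implementation of HTM's Temporal Memory.
    """
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def learn(self, previous, current):
        self.counts[previous][current] += 1       # the input at t+1 is the only teacher

    def predict(self, current):
        following = self.counts[current]
        return max(following, key=following.get) if following else None  # most likely next input

tm = ToyTemporalMemory()
prev = None
for x in "abcabcabc":            # any input stream, one symbol per time step
    if prev is not None:
        tm.learn(prev, x)        # the target for time t is simply the input at t+1
    prev = x

print(tm.predict("a"))           # -> 'b', learned purely from its own input stream
```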
Let’s take reinforcement learning (RL), which anyone aiming to replicate biology should not ignore, because RL attempts to produce agents that learn and optimize their behavior within their environment, just as biology does.
For this purpose, the past-dependent prediction abilities of a TM have great potential.
In RL the agent receives an observation as input and must (learn to) respond with an optimal action, one that maximizes future reward.
The three - observations, actions, rewards - are semantically different. There are various strategies in RL, and the problem is that most of them do not care about explicitly predicting the input at t+1: they either try to predict a “good” action directly, or search for an optimal action by estimating the values of various actions in the context of the current observation.
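To make the value-estimation flavour concrete, here is a minimal tabular Q-learning sketch on a made-up 5-state chain environment (nothing HTM-specific; all names and constants are invented for illustration): it learns action values in the context of the current observation and never explicitly predicts the observation at t+1.

```python
import random

N_STATES, ACTIONS = 5, (0, 1)            # toy chain; action 0 = left, 1 = right
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.3        # made-up learning rate, discount, exploration
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Toy environment: reward 1.0 only for reaching the rightmost state."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

for episode in range(200):
    s = 0
    while s != N_STATES - 1:
        if random.random() < EPS:
            a = random.choice(ACTIONS)                    # explore
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])     # exploit current value estimates
        s2, r = step(s, a)
        # The update uses (observation, action, reward, next observation),
        # but nothing here ever tries to predict s2 from (s, a).
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# Learned greedy policy for the non-terminal states (action 1 = go right).
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)})
```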
Only in some models does the algorithm predict the next input (observation) from the current observation and a potential action, which is then evaluated.
The only way one can use a TM for RL is to feed it one of the following:

- (**observation**, *useless prev action*) to recommend (*next observation*, **next action**)
- (**observation**, **potential action**, **past action value**) to predict (*ignored observation*, *step-ahead action*, **next value**)
- or, in two steps:
  (**observation**, **action**) → (**next observation**, *future action*)
  then
  (**next observation**, **past value**) → (*next next observation*, **value**)
Bold above marks what is either needed as input or desired as output; italics mark inputs I wish I could ignore and outputs that have to be ignored.
This is because a faithful TM must obey the biological model: it predicts only its own inputs, and every one of its output signals must have a corresponding input signal.
The usual response to this problem is “so what, put everything in as both input and output and there you have it: you can do RL with a nice, biologically faithful, square TM”.
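In code, that “square” arrangement amounts to something like this (the encoding widths and helper names are made up for illustration): concatenate everything into one input SDR, let the TM predict the whole thing back, then slice out the small piece the agent actually wanted.

```python
import numpy as np

# Hypothetical encoding widths, only to make the shape of the problem visible.
OBS_BITS, ACTION_BITS, VALUE_BITS = 2048, 32, 16

def pack(obs_sdr, action_sdr, value_sdr):
    """Concatenate observation, action and value into one 'square TM' input."""
    return np.concatenate([obs_sdr, action_sdr, value_sdr])

def unpack(predicted_sdr):
    """Slice the TM's full prediction back into its three semantic parts."""
    obs_part = predicted_sdr[:OBS_BITS]                           # have to ignore
    action_part = predicted_sdr[OBS_BITS:OBS_BITS + ACTION_BITS]  # maybe useful
    value_part = predicted_sdr[OBS_BITS + ACTION_BITS:]           # what the agent wanted
    return obs_part, action_part, value_part
```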
But the problem with that is that the observation vector can be orders of magnitude larger than the action vector or the single scalar value.
The majority of columns are then wasted (computation and memory) on predicting something the agent doesn’t actually need.
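With the same made-up widths as in the sketch above, the arithmetic is blunt:

```python
# 2048 observation bits vs the 32 + 16 bits the agent actually cares about.
wasted = 2048 / (2048 + 32 + 16)
print(f"{wasted:.1%} of the columns predict something the agent never uses")   # ~97.7%
```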
Algorithmically and computationally, there is no actual constraint against using some topology other than the strict rectangle of rows & columns.
When one asks “Why isn’t HTM more popular in ML/AI?”, here’s one good reason: it is resource demanding for reasons of (biological) faithfulness rather than algorithmic ones.
And now the big question mark in the title:
What if the actual purpose of a column is not to predict its own future input?
Imagine you are god or nature trying to build a brain out of wires and electric signals (aka neurons). For that purpose you figured out columns, which take several inputs as cues in order to predict:
- “I think I’ll see a cup!”
Why would it need a following reply stating:
- “Yes you now see a cup!”
Does that make sense?
And there is a simple reason for that: when your mind is built up exclusively of electric wires, you have to physically wire the reinforcing signal back to whoever made the correct assessment.
The actual dialogue might be:
- “I think I see a cup!”
- “good boy!”
TL;DR: what we assume to be input in biological columns could be just reinforcing (learning) feedback, simply because a column needs such feedback in order to improve its predictions.
A high five for taking a good shot.
In software we don’t need that: we can directly update synapses based on the correct/incorrect predictions of the corresponding neurons/cells/columns.
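A sketch of that software shortcut (permanence-style updates in the spirit of HTM, but every name and constant below is made up): the outcome of the column’s own prediction drives learning directly, with no need to wire a “good boy!” signal back in as an extra input.

```python
from dataclasses import dataclass, field

PERM_INC, PERM_DEC = 0.05, 0.02   # hypothetical learning rates

@dataclass
class Synapse:
    presynaptic_was_active: bool   # did the presynaptic cell contribute to the prediction?
    permanence: float = 0.3

@dataclass
class Segment:
    synapses: list = field(default_factory=list)

def reinforce(segment: Segment, prediction_was_correct: bool) -> None:
    """Strengthen the synapses behind a correct prediction, weaken them otherwise."""
    delta = PERM_INC if prediction_was_correct else -PERM_DEC
    for syn in segment.synapses:
        if syn.presynaptic_was_active:                 # only the synapses that contributed
            syn.permanence = min(1.0, max(0.0, syn.permanence + delta))

# Example: a segment whose active synapses just predicted correctly.
seg = Segment([Synapse(True), Synapse(False), Synapse(True)])
reinforce(seg, prediction_was_correct=True)
print([round(s.permanence, 2) for s in seg.synapses])  # [0.35, 0.3, 0.35]
```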