HTM is a sequence learning system, right?

TM is the core of HTM, and in essence it is sequence learning.
Anomaly detection is sequence recognition.

As you can see below from the four possible problems of sequence learning, TM can solve the first three in deterministic, closed-loop scenarios.
The questions are then:

  1. How does it solve problem 4?

    • does it need an implementation of the basal ganglia (BG)?
    • do you expect to solve this inside the CC? It does not seem feasible to pack this logic into the cortex!
      (reusing the BG seems much more plausible, because all species learn at least reactions)
  2. How do you solve stochastic sequences?

  3. How do you solve the open loop?

I can see that adding hierarchy can “close” the loop of problem 3 (pun intended) by predicting longer patterns of the sequence.
That still leaves 1 and 2.

Also check:

These different sequence learning problems can be more precisely formulated
as follows (assume a deterministic world for now):

– Sequence prediction: s_i, s_{i+1}, ..., s_j → s_{j+1}, where 1 ≤ i ≤ j < ∞; that
is, given s_i, s_{i+1}, ..., s_j, we want to predict s_{j+1}. When i = 1, we make
predictions based on all of the previously seen elements of the sequence.
When i = j, we make predictions based only on the immediately preceding
element.
– Sequence generation: s_i, s_{i+1}, ..., s_j → s_{j+1}, where 1 ≤ i ≤ j < ∞; that is,
given s_i, s_{i+1}, ..., s_j, we want to generate s_{j+1}. (Put in this way, it is clear
that sequence prediction and generation are essentially the same task.)
– Sequence recognition: s_i, s_{i+1}, ..., s_j → yes or no, where 1 ≤ i ≤ j < ∞;
that is, given s_i, s_{i+1}, ..., s_j, we want to determine if this subsequence is
legitimate or not. (There are alternative ways of formulating the sequence
recognition problem, for example, as a one-shot recognition process, as
opposed to an incremental step-by-step recognition process as formulated above.)
With this formulation, sequence recognition can be turned into sequence
generation/prediction, by basing recognition on prediction (see the chapter
by D. Wang in this volume); that is, s_i, s_{i+1}, ..., s_j → yes (a recognition
problem), if and only if s_i, s_{i+1}, ..., s_{j−1} → s_j^p (a prediction problem) and
s_j^p = s_j^a, where s_j^p is the prediction and s_j^a is the actual element.

– Sequential decision making (that is, sequence generation through actions):
there are several possible variations. In the goal-oriented case, we have
s_i, s_{i+1}, ..., s_j; s_G → a_j, where 1 ≤ i ≤ j < ∞; that is, given s_i, s_{i+1}, ..., s_j
and the goal state s_G, we want to choose an action a_j at time step j
that will likely lead to s_G in the future. In the trajectory-oriented case,
we have s_i, s_{i+1}, ..., s_j; s_{j+1} → a_j, where 1 ≤ i ≤ j < ∞; that is, given
s_i, s_{i+1}, ..., s_j and the desired next state s_{j+1}, we want to choose an action
a_j at time step j that will likely lead to s_{j+1} in the next step. In the
reinforcement-maximizing case, we have s_i, s_{i+1}, ..., s_j → a_j, where 1 ≤ i ≤ j < ∞;
that is, given s_i, s_{i+1}, ..., s_j, we want to choose an action a_j at time step
j that will likely lead to receiving maximum total reinforcement in the future.
The calculation of total reinforcement can be in terms of discounted or
undiscounted cumulative reinforcement, in terms of average reinforcement,
or in terms of some other functions of reinforcement (Bertsekas and Tsitsiklis
1996, Kaelbling et al. 1996, Sutton and Barto 1997).

The above exposition reveals the relationship among different categories of sequence learning, under the assumption of a deterministic world. Another assumption in the above discussion is that we considered only closed-loop situations: that is, we deal only with one step beyond what is known or done.
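The recognition-via-prediction reduction above can be sketched in a few lines of Python (a toy illustration only; the first-order transition table and the name `TransitionModel` are my stand-ins for a real predictor like TM, not anything from the quoted paper):

```python
# Toy "recognition via prediction": a sequence is accepted iff, at every
# step, the model's prediction s_j^p equals the actual element s_j^a.

class TransitionModel:
    """First-order predictor: remembers which element followed which."""
    def __init__(self):
        self.next = {}                  # state -> predicted successor

    def learn(self, sequence):
        for cur, nxt in zip(sequence, sequence[1:]):
            self.next[cur] = nxt

    def predict(self, state):
        return self.next.get(state)     # s_j^p, or None if never seen

    def recognize(self, sequence):
        # yes iff prediction == actual for every transition in the subsequence
        return all(self.predict(cur) == nxt
                   for cur, nxt in zip(sequence, sequence[1:]))

m = TransitionModel()
m.learn("ABCD")
print(m.recognize("BCD"))   # True  - follows learned transitions
print(m.recognize("ACB"))   # False - A->C was never seen
```

Recognition here falls out of prediction for free, which is exactly why TM's anomaly detection counts as sequence recognition.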

How does it rearrange your mental furniture to change the definition of TM to say that it remembers transitions between local states?

Transition Memory.

Most of the problems you are outlining are related to longer sequences.

If you back down to:
(this state) expects (that state) or (burst to signal novelty)
most of the problems you raise fade away.
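That one-step view fits in a couple of lines (a minimal sketch; the dict-of-sets and the `burst` flag are my stand-ins for dendritic predictions and minicolumn bursting):

```python
# Minimal "transition memory": each state expects a set of successors;
# an unexpected successor is signalled as novelty (a "burst").

expected = {}   # state -> set of states previously seen to follow it

def step(cur, nxt):
    burst = nxt not in expected.get(cur, set())   # novelty signal
    expected.setdefault(cur, set()).add(nxt)      # learn the transition
    return burst

print(step("A", "B"))   # True  - first time, novel
print(step("A", "B"))   # False - now expected
print(step("A", "C"))   # True  - novel branch; learned after one exposure
```

Note there is no notion of a whole sequence anywhere, only local state-to-state expectations.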


Hmm… I still can’t imagine how Transition Memory solves the RL problem.

Any time I think about it, I jump to using some kind of score to PREFER different paths.

I can’t see how to integrate states and actions as transitions… yeah, this seems to be my biggest hurdle!

State => State is OK

State => Action => State (this one is for planning)

like:
S-SDR => A-SDR: Q? => S-SDR (no longer a transition)
It always seems to need a score, so you can make a decision.
SDRs do not have scores/values/Q’s.

Said another way: transitions are good for learning a trajectory, but they can’t solve decision making if you want to change the trajectory?
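For comparison, the standard RL answer keeps the score outside the transition memory: the model stores S => A => S' as plain transitions, while a separate table attaches a value to each (state, action) pair. A minimal tabular Q-learning sketch (the textbook algorithm, not an SDR or HTM mechanism; the tiny chain world is made up for illustration):

```python
import random

# Tiny deterministic chain world: states 0..3, action -1/+1 moves left/right,
# reward only on reaching state 3.
def world(s, a):
    s2 = max(0, min(3, s + a))
    return s2, (1.0 if s2 == 3 else 0.0)

Q = {}                                   # (state, action) -> learned score
alpha, gamma = 0.5, 0.9

def q(s, a):
    return Q.get((s, a), 0.0)

random.seed(0)
for _ in range(500):                     # short episodes from state 0
    s = 0
    for _ in range(10):
        a = random.choice([-1, 1])       # explore randomly
        s2, r = world(s, a)
        # Q-learning update: nudge score toward reward + best future value
        Q[(s, a)] = q(s, a) + alpha * (r + gamma * max(q(s2, -1), q(s2, 1)) - q(s, a))
        s = s2

# The transitions themselves stay value-free; decisions read the Q table.
print([max([-1, 1], key=lambda a: q(s, a)) for s in range(3)])
```

This is the structure the BG is often proposed to supply: the real numbers you keep reaching for live in a side table keyed by transitions, not inside the transitions themselves.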


If you try to make the cortex do the RL, then yes, I see how there is no good answer.
The cortex’s learning of transitions is very local and knows little or nothing about goals.
It learns to model the world and signals when something novel is happening.

If my understanding is correct, you are right: the “older” subcortical sections of the brain drive states, goals, and reinforcement learning.

And in this post:


In this post I describe how judgement colors memory to add “embodied judgement”.

More on “baked in” judgement:

Here is the same mechanism set in the larger context:


Enough of a data dump?


:wink: Thanks… how would you approach connecting TM with the BG? Like you say, TM is local… In RL lingo I can see it as the model, but then again I start thinking about rewards this time ;( … all my thoughts lead me to use some sort of real number. I seem unable to see a way to do SDR decision making, maybe because I think on the SDR level rather than on the neuron level.


Please go the other way and think at the system level; I have been preaching this here as the way forward for the last few years.

See my last few posts in this thread.


I’ve always found Judea Pearl’s Ladder of Causality to be a good outline of the cognitive complexity of the problem you are trying to solve, and of the pieces that need to be built to solve it.

The three levels are Association, Intervention, and Counterfactuals. HTM can solve the Association level, since it is just an associative learner, and it does that well. Doing something like RL requires implementing the Intervention level, which is the ability to act on the world and associate the responses with your actions. This requires a decision-maker component, a goal-based evaluation, and a supervised learning capability to tie the actions to the outcomes. This is something that HTM doesn’t explicitly have yet.

The Counterfactual level is more subtle, but it entails understanding what would happen if you didn’t do something, which leads to a common-sense understanding of actions and of how the world works. This last part is the hardest, but I think it is not relevant until you understand and implement the Intervention level.


Are you proposing that a mouse does all of that?
Stepping back from mammals, do lizards do any of that?

A theory of brain function should apply to all critters that have similar brains.


I don’t think I was making any biological claim, but merely a practical machine learning claim.

However, I think all animals with some capacity to learn can make associations from operant conditioning, which satisfies the first level of causality, “Association”.

Those animals that can map their own actions to outcomes are those that satisfy the second level of “Intervention”.

Finally, those animals that can map the outcome of not performing an action are those that satisfy the third level, “Counterfactual”.

Again, I wasn’t really trying to make any claims about biology. I was more interested in the practical architectural problem of building a cognitive system that satisfies some of the properties we take for granted in humans. This framework is a useful roadmap for identifying which problems you need to solve.