Is there a Basal Ganglia theory equivalent to HTM?


I’m curious to know if there are any basal ganglia researchers or theories that Jeff or anyone else at Numenta would recommend as a companion to HTM for the neocortex.


This is a good question!

Since one of the “key” tasks of basal ganglia is motor cortex output modulation, it looks like a really important piece in order to achieve practical sensory-motor inference.


I know almost nothing about basal ganglia. It would be great if someone wanted to summarize the popular theories around the structure and how they might be relevant to HTM.

In some sense the functionality of the cortex should not be too dependent on the details - the cortex just knows that it sends and receives motor commands to/from some generic downstream motor area.


Coincidentally I was recently looking through Garcia-Munoz and Arbuthnott. Basal ganglia—thalamus and the “crowning enigma". 2015., which has some interesting info on basal ganglia and thalamocortical dynamics for motor processing.


I wouldn’t say any theories are equivalent to HTM, and I’m not sure what that would mean anyway, but here’s some of the stuff I found when I was reviewing biological reinforcement learning that was most interesting to me. A lot of it is intricately tied to models of the prefrontal cortex since that appears to be the main participant in BG<->cortical interaction.

[1] O’Reilly, Randall C., and Michael J. Frank. “Making working memory work: a computational model of learning in the prefrontal cortex and basal ganglia.” Neural computation 18.2 (2006): 283-328.

Stable high-level representations in the PFC condition the sub-policy executions in the basal ganglia; part of what is learned is the gating of these high-level representations in and out of working memory (PFC).

[2] Jin, Xin, and Rui M. Costa. “Shaping action sequences in basal ganglia circuits.” Current opinion in neurobiology 33 (2015): 188-196.

Chunking of behavior into sub-policies by start and stop signals originating in the striatum.

[3] Collins, Anne GE, and Michael J. Frank. “Opponent actor learning (OpAL): Modeling interactive effects of striatal dopamine on reinforcement learning and choice incentive.” Psychological review 121.3 (2014): 337.

D1 receptors (sensitive to dopamine increases) and D2 receptors (sensitive to dopamine decreases) implement Go and NoGo signals that are real-valued and enable competition (voting) for actions.

[4] Frank, Michael J., and David Badre. “Mechanisms of hierarchical reinforcement learning in corticostriatal circuits 1: computational analysis.” Cerebral cortex 22.3 (2012): 509-526.

Structural organization in the PFC with more anterior regions representing increasingly abstract states.

[5] Schroll, Henning, and Fred H. Hamker. “Computational models of basal-ganglia pathway functions: focus on functional neuroanatomy.” Frontiers in systems neuroscience 7 (2013).

Recent review of a lot of BG/RL work, including some of the other papers in this list.

[6] Baladron, Javier, and Fred H. Hamker. “A spiking neural network based on the basal ganglia functional anatomy.” Neural Networks 67 (2015): 1-13.

Detailed spiking model implementing direct (D1), indirect (D2), and hyperdirect pathways, where the first stimulates actions, the second suppresses actions, and the third enforces a winner-take-all competition.

[7] Mello, Gustavo BM, Sofia Soares, and Joseph J. Paton. “A scalable population code for time in the striatum.” Current Biology 25.9 (2015): 1113-1122.

A model for temporally extended representations related to behavioral sequences.

[8] Gershman, Samuel J., Ahmed A. Moustafa, and Elliot A. Ludvig. “Time representation in reinforcement learning models of the basal ganglia.” (2014).

A review paper specifically dealing with proposed representations of time.
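
To give a feel for the Go/NoGo competition mentioned under [3], here is a minimal sketch loosely modeled on the OpAL-style updates. It is not the authors' code. The single shared critic value `V`, the two-action toy task with deterministic rewards, the learning rates, and the names `opal_step` and `action_values` are all assumptions made for illustration.

```python
def opal_step(state, action, reward, alpha_c=0.1, alpha_g=0.1, alpha_n=0.1):
    """Apply one OpAL-style update in place and return the prediction error."""
    delta = reward - state["V"]        # critic prediction error (dopamine-like signal)
    state["V"] += alpha_c * delta      # critic update (assumed: one shared value)
    # Actor updates are Hebbian: the change scales with the current weight.
    state["G"][action] += alpha_g * state["G"][action] * delta     # Go (D1): grows on bursts
    state["N"][action] += alpha_n * state["N"][action] * (-delta)  # NoGo (D2): grows on dips
    return delta


def action_values(state, beta_g=1.0, beta_n=1.0):
    """Real-valued vote per action: Go weight minus NoGo weight."""
    return [beta_g * g - beta_n * n for g, n in zip(state["G"], state["N"])]


# Toy task (assumed): action 0 is always rewarded, action 1 never is.
state = {"V": 0.0, "G": [1.0, 1.0], "N": [1.0, 1.0]}
for trial in range(20):
    action = trial % 2                 # try both actions in alternation
    reward = 1.0 if action == 0 else 0.0
    opal_step(state, action, reward)

act = action_values(state)
print(act[0] > act[1])  # True: the rewarded action wins the vote
```

In this toy run the rewarded action ends up with a large Go weight and a small NoGo weight, and the unrewarded action the reverse, so the two pathways effectively vote for and against each action.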


I think that the basal ganglia has many “functionalities”, but the one that could be considered now (in the context of sensory-motor inference) is the motor loop. I think that this one is well understood. Others, such as the cognitive loop, are much harder. They seem to be involved in many mental conditions such as autism, schizophrenia, addiction, etc.

The motor loop gets inputs from the motor cortex and acts on the motor thalamic nucleus (Ventral Lateral). My understanding is that the main task of the motor loop is to fine-tune the descending motor cortex outputs. There are a couple of pathways controlled by dopamine receptors: the direct pathway and the indirect pathway. Both use really neat tricks to focus the movement: the direct pathway disinhibits the pattern generator that has to produce the movement, and the indirect pathway inhibits the surrounding pattern generators. The circuit is amazingly elegant.

When the dopamine-producing cells fail (they are in another part of the brain), both pathways are interrupted, which produces Parkinson’s disease. It seems that the motor cortex activations can’t focus on the central pattern generator (activating the surrounding ones, which produces the random movement that characterizes the disease).


Thanks - that’s a great list. I found the reference below to be particularly interesting. Gives some insights into how cortex might control sub-cortical behaviors.


Haven’t been on here in a while. I happen to have a pretty good understanding of HTM theory and some knowledge of neuroscience. I’ve got my own ideas on how it might interact with the cortex. Maybe if I’m not completely wrong, I can help point you guys in the right direction, or at least bring up some interesting talking points.

So on the BG:

A Few Things I Know:

  • The basal ganglia mainly consists of the striatum, globus pallidus (internal and external parts), and a few other small nuclei.
  • The striatum gets a lot of input from the cortex (from an oculomotor area in the occipital lobe, and from pretty much the entire frontal lobe).
  • The striatum also gets inputs from a few dopaminergic nuclei. It then sends outputs along two different pathways; one excited by the dopamine, one inhibited by it. The excitatory pathway (called the direct pathway) outputs directly to the Internal Globus Pallidus (GPi), and the inhibitory pathway (the indirect pathway) outputs to the External Globus Pallidus (GPe), which then outputs to the GPi. The GPi then outputs to the thalamus, which relays signals back to the cortex.
  • GPi and GPe only produce inhibitory outputs and don’t appear to do any learning. The inhibitory outputs (including those to the thalamus) end up being disinhibitory, meaning that an inhibitory signal from one area in the basal ganglia to another ends up leading to an excitatory signal being sent somewhere else. For example, an inhibitory signal from the GPi to the thalamus ends up sending an excitatory signal to the cortex.
  • There are a few other minor pathways, but those are the main two.
  • The frontal lobe is structured in a hierarchy (limbic system at the base, motor at the top, and various prefrontal areas in the middle, which feature a lot of connections to non-frontal areas), and each part has an associated part of the basal ganglia it communicates with. The output from the thalamus back to the cortex feeds back into the same part of the cortex that the input originated from, forming a loop.
  • The striatum is known to be active in learning, the other areas of the basal ganglia, not so much. The dopaminergic inputs to it are very strongly connected to the limbic system.
  • Damage to the basal ganglia tends to have few immediate effects, but in the long term leads to erratic behavior. For example, Parkinson’s and Huntington’s diseases.
  • Motor output appears to be based on population coding; i.e., a large number of neurons control the same muscle, and the more neurons that fire, the more the muscle contracts.

What I Don’t Know:

  • How exactly does the striatum learn? Does it use some form of sequence memory? Something related to spatial pooling? Something else?
  • How precise are these loops (Cortex -> Striatum -> GP -> Thalamus -> Cortex)? Does the thalamus excite the same group of neurons that triggered it (my guess is yes)? If so, is it the same column? Same minicolumn? Same neuron?

What I Think is Going On

This won’t be everything I think is going on; just a basic summary of the most important ideas.

The frontal lobe is structured in a hierarchy. The base of the hierarchy consists of various areas of the cingulate cortex, which gets inputs primarily from the limbic system and controls high-level goals (stay happy, avoid pain, etc.). The next level up appears to be the orbitofrontal cortex, which gets inputs from the cingulate cortex, as well as the amygdala and a couple of other areas. It’s known to be involved in moral decision making. The next couple of levels are various prefrontal areas, such as the dorsolateral prefrontal cortex. These areas have very strong connections to pretty much every other area of the brain, especially high-level areas in the temporal and parietal lobes. This seems to be where the Where/What sensory pathways in the cortex converge. These areas are known to be very involved in abstract thinking, decision making, working memory (they’re strongly connected to the hippocampus too), and strategic thinking. They then feed into the premotor areas (which are very strongly connected to the high-level somatosensory cortex), and then into the motor areas (strongly connected to the low-level somatosensory cortex).

Every level of the hierarchy I just mentioned has strong connections with the basal ganglia.

What I think is going on in the frontal lobe is that abstract goals originate in or near the limbic system, and the higher levels of the frontal hierarchy work to convert them into increasingly specific and flexible goals, gradually incorporating information from sensory regions as appropriate.

So what is the basal ganglia doing? Reinforcement learning.

What I think is going on is that the limbic system is constantly determining if the inputs are producing results that meet the goals (be happy, avoid pain, etc). If they are, it sends positive reinforcement to the striatum, and if not, negative reinforcement. However, I believe the striatum is learning two representations of the output of the cortex; one that learns patterns associated with positive reinforcement, and one that learns patterns associated with negative reinforcement. The positive patterns output to the direct pathway, and the negative to the indirect pathway. The output is a pair of SDRs, each likely a subset of some high-level activity in the cortex.

The globus pallidus then appears to run the biological equivalent to a bitwise AND operation on the SDRs, with the input from the indirect pathway inverted. Essentially, it only allows bits through that are associated with positive reinforcement, and not associated with negative reinforcement (as it’s conceivable that a bit in the SDR could be associated with both). Because of how the logic works, a negative of this SDR is sent to the thalamus, and the thalamus then inverts it, and sends the signals to the cortex as a bias signal.

TL;DR: The cortex sends an SDR to the striatum. The striatum splits it into two SDRs; one featuring only the bits associated with positive reinforcement, and one featuring only the bits associated with negative reinforcement. The basal ganglia does some bitwise operations on the SDRs and the thalamus sends the result back to the cortex to bias activity in columns (or minicolumns, etc… Not sure about the granularity) associated with positive behavior, cancelling out any that have been associated with negative behavior.
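
The bitwise reading above can be sketched in a few lines of Python. This is only a toy under stated assumptions: SDRs are modeled as sets of active bit indices, the striatum’s learned associations are reduced to two fixed sets (`positive_bits`, `negative_bits`), and all function names are hypothetical. None of this comes from an existing HTM library.

```python
def striatum_split(cortex_sdr, positive_bits, negative_bits):
    """Split the cortical SDR into direct- and indirect-pathway SDRs."""
    direct = cortex_sdr & positive_bits     # bits tied to positive reinforcement
    indirect = cortex_sdr & negative_bits   # bits tied to negative reinforcement
    return direct, indirect


def globus_pallidus(direct, indirect, n_bits):
    """AND the direct SDR with the inverted indirect SDR, then emit the negative."""
    allowed = direct - indirect             # direct AND (NOT indirect)
    return set(range(n_bits)) - allowed     # inhibitory output: negative of the result


def thalamus(gpi_output, n_bits):
    """Invert the pallidal output to recover the bias SDR sent to the cortex."""
    return set(range(n_bits)) - gpi_output


def basal_ganglia_bias(cortex_sdr, positive_bits, negative_bits, n_bits):
    direct, indirect = striatum_split(cortex_sdr, positive_bits, negative_bits)
    return thalamus(globus_pallidus(direct, indirect, n_bits), n_bits)


# Example values are made up for illustration.
cortex_sdr = {1, 4, 7, 9, 12}
positive_bits = {1, 4, 9, 15}
negative_bits = {4, 12}
bias = basal_ganglia_bias(cortex_sdr, positive_bits, negative_bits, n_bits=16)
print(sorted(bias))  # [1, 9]
```

Bit 4 is associated with both positive and negative reinforcement, so it is cancelled; bits 7 and 12 never had a positive association, so they are not biased either.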

An interesting side effect of all of this is that this should mean that there will be a large number of patterns that the frontal lobe is biased to avoid learning. That, and the higher levels of this hierarchy likely become organized to be closely associated with the motor map. If you remove this bias, it seems as though the cortex would no longer have anything keeping it from learning these patterns, as well as nothing to keep it organized. Seeing as neurons in HTM associate themselves with patterns randomly, this would suggest that damage to the basal ganglia would over time lead to very erratic output, and the brain would eventually begin to lose fine control over the body. Very much like what you see in diseases of the basal ganglia, like Parkinson’s and Huntington’s.

Any feedback would be appreciated! I’d love to work with you guys if I had time; HTM theory really interests me. Unfortunately I’m pretty busy right now. I’ve got a compiler I’m working on for a really ambitious programming language, and I’m working at a startup that’s just getting into the prototyping phase. Hopefully I’ll have some time soon. Maybe I’ll write an HTM implementation to test the compiler at some point. We’ll see.


Is that correct? At least in the context of the motor loop, my understanding was that the BG just inhibits/disinhibits L5 motor neurons in motor cortex (Betz cells). Those are connected down to the dorsal spine nuclei. Those nuclei “decode” motor cortex commands into low-level actions over the muscles. I guess I need to learn more :)


Well that’s my interpretation of it anyway. I’m mostly looking at the connections inside the basal ganglia to figure out what it does. Take my explanation with a grain of salt of course (the “what I know” section of my post has most of the facts, the rest is my speculation). So long as I got all the inhibitory/disinhibitory/etc. connections right, and my assumptions on the granularity of the correspondence in the loops are correct (that a signal from one part of the cortex should go through the basal ganglia and influence bias in the same part of the cortex, probably the same column), it’s the most plausible explanation that I’m aware of.

I think Jeff has said a few times that he thinks the Basal Ganglia is an action selector, as in the cortex sends it a union of SDRs for possible actions, and it filters all but one and returns it to the cortex as a bias signal. However, based on the structure of the basal ganglia, I’m going to have to disagree with him.
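
For contrast with the bias-filter model above, the action-selector reading can be sketched like this. It is a toy only. It assumes the candidate action SDRs are still individually identifiable (a bare union would not say which bits belong together), and the scoring rule, a sum of per-bit values in a hypothetical `bit_value` table, is invented here because nothing in the thread says how the winner would be chosen.

```python
def select_action(candidates, bit_value):
    """Pick one candidate SDR from the union and return it as the bias signal."""
    def score(sdr):
        # Assumed scoring rule: sum of learned per-bit values (unknown bits count 0).
        return sum(bit_value.get(bit, 0.0) for bit in sdr)

    union = set().union(*candidates.values())   # what the cortex sends
    winner = max(candidates, key=lambda name: score(candidates[name]))
    bias = candidates[winner]                   # everything else is filtered out
    return winner, union, bias


# Example values are made up for illustration.
candidates = {
    "reach": {1, 2, 3},
    "grasp": {3, 4, 5},
    "withdraw": {6, 7, 8},
}
bit_value = {1: 0.5, 2: 0.2, 3: 0.1, 4: -0.3, 5: 0.4, 6: -0.6, 7: 0.1, 8: 0.0}

winner, union, bias = select_action(candidates, bit_value)
print(winner, sorted(bias))  # reach [1, 2, 3]
```

The difference from the bias-filter model is that here the returned SDR is always exactly one candidate, whereas the filter can return any subset of the input bits.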

There’s a bit of extra circuitry in the basal ganglia that I didn’t bring up, particularly a branch in the indirect pathway. I’ve found a few conflicting descriptions of it, some saying that there’s a direct inhibitory connection from GPe to GPi (as I described in my model), and some saying there is no such connection. However, everything seems to agree that there is an additional connection that goes through the Subthalamic Nucleus, though it seems to me (based on the inhibitory/disinhibitory signals) as though it’s doing the same thing the more direct connection would be doing, just with an extra nucleus to go through.

The signals from the basal ganglia back to the cortex (by way of the thalamus) are only excitatory from what I’ve read. No inhibition there.

I did forget about the Betz cells though. Thanks for reminding me. Easy change. That just means that the biasing is mainly occurring in L5, the output layer. L2/3 etc. would likely go unbiased then, but biasing in the output is still what matters. (edit: I’m half-right here. Betz cells appear to only be in the motor cortex. The basal ganglia interacts with pretty much the entire frontal lobe. L2/3 etc. might still have biases, just not in the motor cortex. Read the rest of my earlier post; the TL;DR doesn’t mention the hierarchy in the frontal lobe.)

Plus, I think my last point still makes a lot of sense here; if my model of the basal ganglia is accurate (and HTM of course), it explains diseases like Huntington’s and Parkinson’s really well, at least as far as I can reason. Maybe there’s something I’m overlooking and an experiment with my model will prove me wrong though.


I haven’t been following the conversation, so this might not matter, but Betz cells are probably only special because they’re huge. Many other regions have L5 cells which target the spinal cord, such as primary somatosensory cortex.

In reply to your list of facts:
I’m not sure, but I think most regions have corticostriatal neurons. S1 seems to project to the basal ganglia, for example.

At least for whisker primary somatosensory cortex, if I recall correctly, there are 3 thalamic input pathways. With anaesthesia, they convey different information (self-induced whisker movement, environment-induced whisker movement, and movement regardless of cause). But I read somewhere that it’s more complicated without anaesthesia.
Source: Layer-Specific Touch-Dependent Facilitation and Depression in the Somatosensory Cortex During Active Whisking (by Derdikman et al)
Anyway, layer 6 sends feedback but it’s probably modulatory. Layer 5b sends to the nucleus for the self-induced whisking pathway (during anaesthesia at least), POm I think. It projects to most or all somatosensory/motor regions, including S1, which it targets at L1 and L5a.

According to this source, the basal ganglia deal (partially?) with combinations of muscles. Somatotopic organization in rat striatum: evidence for a combinational map (by L L Brown)

A lot of this is based on memory so you may want to confirm some things, and maybe look for contradictory sources. I can give you more sources if you’d like.

Some speculation:
Some behavior isn’t so closely tied to reward. For example, when you touch an object to figure out what it is, or look at something unexpected, it might not be based on reward. It may still involve dopamine and the basal ganglia, but I’m not sure they should be learned based on reward. The brain should generate behavior intended to understand the world, not achieve reward, since most structure of the world is learned from self-movement.


I was mostly talking about the internal structure of the basal ganglia and the most prominent connections. I’m sure there are plenty of smaller connections to it; my hypothetical model of the basal ganglia would probably work better with additional sensory inputs to the striatum, as I’m suggesting that the striatum learns patterns associated with positive/negative reinforcement. Whether or not a particular muscle command is a good idea would be very dependent on the current orientation of said muscles.

As for your speculation, I think that’s a good point, one that I’ve considered myself. I think there are likely various conditions that trigger positive/negative reinforcement in the brain that aren’t necessarily directly associated with reward. For example, perhaps there exists some mechanism where the brain can detect how much learning is occurring, and trigger some slight reward as a form of “curiosity reward.” There may also be some form of feedback from the cortex to various reward systems, possibly allowing reward systems to change to some extent. For example, IIRC the amygdala gets some feedback from the cortex and also is in some cases able to recognize the emotional content of very complex events, much more complex than you would expect if it were hardcoded by evolution.


This would make sense. If actions are suggested locally (a tenet of HTM theory), there is a need for an evaluation of the outcome of the action not only from the point of view of the column suggesting it but for the benefit of the whole human host. The BG gates the actions with negative global expected outcome by inhibiting them.


From the cited paper: “In spite of the fact that the whole cortex is represented in striatum and movement is certainly not its only function, here we review thalamic output to motor cortex because of our long-standing interest in movement and the basal ganglia. Similarly L1 is not only present in motor cortex but invests the complete cortex. In theorizing about basal ganglia […] it is as important to remember that movement is only the most obvious output”.

Motor movement is only one of the two kinds of manipulations realized by our brain to produce the results it needs. There is external manipulation (movement) and internal manipulation (some forms of thinking). I don’t see why BG shouldn’t influence both, as both have costs and thus need cost/benefit analysis before action.


I like the conclusion that there would be things we are biased not to learn. E.g. telling people about the internet before it was so named, I found no one outside the profession who was not bored and irritated by the info; the least irritated were so because they could refute me with the miracle of the FAX machine. I get the same experience for a second time with my own project; I wait to see if the possible bias is overcome and anyone finds something on the other side.

My hope is that we can develop thinking machines that can figure out how to teach us about our biases. And they could maybe develop (even wet) machines to help them with their own biases.

[EDIT - switched link from MachineLearning to Futurology]


You may find this paper helpful:
How Basal Ganglia Outputs Generate Behavior - Henry H. Yin