Proof of concept Basal Ganglia



I created, tested, and analysed a proof of concept for the Basal Ganglia. The model does not include any sort of motor control, so it is not useful, but I think it’s an interesting stepping stone. This work is based on the work of (Sungar, 2017). The biggest difference between their model and mine is that mine models the Globus Pallidus in detail.


Thank you for reading,


I have some simple questions. I don’t know much about the Basal Ganglia or reinforcement learning, so bear with me.

  1. Why is it important to have two pathways (striatums)?
  2. Is the reward some function of how well the HTM is performing?
  3. Are you using resets in the TM between sequences during training?

  1. This model would probably work just as well if it had a single pathway through the Striatum because the cortex in this model is very small.
    My hypothesis is that the Striatum does not represent the same information as the cortex does. The Striatum only represents cortical information which is relevant to the basal ganglia. The purpose of the Striatum is to reduce the number of neurons which the Globus Pallidus (G.P.) must consider without losing information. The D1 and D2 pathways through the Striatum specialize in representing positive and negative events, respectively. Not included in this model is a third type of Striatum neuron which contains both D1 and D2 receptors, and I hypothesize that these neurons participate in both of the Striatum pathways.

  2. The reward is what the basal ganglia is trying to predict, and the expected value is the basal ganglia’s prediction of the sum of the rewards it will receive in the near future. This model has no control over the reward. The TD error is a function of how well the basal ganglia predicts the rewards.

  3. I do not use resets in the TM. Instead, I show the TM 2-4 random inputs between sequences, which I think has the effect of preventing predictions from persisting between sequences. Also, the order in which the sequences are shown is random, so any predictions from one sequence to the next should be meaningless and should not occur often.
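
To make the relationship in answer 2 between the reward, the expected value, and the TD error concrete, here is a minimal sketch of the standard one-step TD error. It is not the code from this model: the function name `td_errors`, the discount factor `gamma`, and the convention that the value after the last step is zero are all assumptions made for illustration.

```python
def td_errors(rewards, values, gamma=0.9):
    """One-step TD errors: delta_t = r_t + gamma * V_{t+1} - V_t.

    rewards[t] is the reward received at step t (supplied by the
    environment, not controlled by the model).
    values[t] is the predicted (discounted) sum of upcoming rewards.
    gamma is an assumed discount factor; the value after the final
    step is assumed to be 0.
    """
    errors = []
    for t, reward in enumerate(rewards):
        next_value = values[t + 1] if t + 1 < len(values) else 0.0
        errors.append(reward + gamma * next_value - values[t])
    return errors


# A reward that arrives at the last step of a three-step sequence.
rewards = [0.0, 0.0, 1.0]

# With no prediction at all, the whole reward shows up as TD error.
naive = td_errors(rewards, [0.0, 0.0, 0.0], gamma=0.5)

# With values that already anticipate the reward, the TD error is zero.
learned = td_errors(rewards, [0.25, 0.5, 1.0], gamma=0.5)
```

In this sketch the reward is an external input, and the TD error shrinks only as the value predictions improve, which matches the description above: the error measures how well the rewards are predicted, not how well the cortex performs.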


There are some odd interpretations of the two channels; things like go/no-go and such.
Once you place the channels in a larger framework of motor control, most of this falls away, and it starts to look like the true purpose is something more mundane, like extend & retract.
Please see this paper, particularly section 3, with attention to the bit about posture - a mouse balancing against external forces.
Henry H. Yin, How Basal Ganglia Outputs Generate Behavior

I hope this will give you some background to make an informed evaluation of this proposed model.


Then where do the rewards come from?