What is the Cerebellum doing?

thanh-binh.to · July 9, 2023, 12:11pm

It is clear because the cart-pole cannot be successfully controlled by classical controllers.
I am also interested in how you implement the negative controllers based on the current HTM.

@dmac, could you please explain it in more detail, or share your code with us?

Thanks

cezar_t · July 9, 2023, 2:01pm

@thanh-binh.to I don’t know what you mean by “classical controller”, but a PID controller with the right parameters will easily balance the cartpole. At least that’s what hoverboards and other two-wheel balancing bots use.

It is not clear in @dmac’s movie whether the cartpole is applying a learning algorithm to update its coefficients over the 3 episodes, or if they are showcasing the same set of already trained (or hand-tuned?) coefficients.

In all three episodes the oscillation seems to grow slowly (and over longer episodes the pole would fall), which a PID would be able to dampen.
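To make the PID point concrete, here is a minimal sketch of a PD controller (a PID without the integral term) acting on the pole angle only. Everything in it is an assumption for illustration: the physical constants follow the classic cart-pole benchmark, the force is continuous rather than the discrete left/right action of the gym task, and the gains `kp` and `kd` are hand-picked, not tuned values from anyone in this thread.

```python
import math


def simulate_pd_cartpole(steps=500, kp=50.0, kd=5.0, theta0=0.05):
    """Balance a simulated cart-pole with a PD law on the pole angle.

    Returns (final cart position, final pole angle, largest |angle| seen).
    """
    # Assumed constants (classic cart-pole benchmark values).
    gravity, mass_cart, mass_pole = 9.8, 1.0, 0.1
    half_length, dt, max_force = 0.5, 0.02, 10.0
    total_mass = mass_cart + mass_pole

    x, x_dot, theta, theta_dot = 0.0, 0.0, theta0, 0.0
    max_abs_theta = abs(theta)
    for _ in range(steps):
        # PD law: push the cart toward the side the pole is leaning to.
        force = kp * theta + kd * theta_dot
        force = max(-max_force, min(max_force, force))

        # Standard cart-pole equations of motion, integrated with Euler steps.
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        temp = (force + mass_pole * half_length * theta_dot ** 2 * sin_t) / total_mass
        theta_acc = (gravity * sin_t - cos_t * temp) / (
            half_length * (4.0 / 3.0 - mass_pole * cos_t ** 2 / total_mass)
        )
        x_acc = temp - mass_pole * half_length * theta_acc * cos_t / total_mass

        x += dt * x_dot
        x_dot += dt * x_acc
        theta += dt * theta_dot
        theta_dot += dt * theta_acc
        max_abs_theta = max(max_abs_theta, abs(theta))
    return x, theta, max_abs_theta


final_x, final_theta, worst_theta = simulate_pd_cartpole()
```

Because only the angle is fed back, the cart is left to drift slowly; a fuller controller would add terms on cart position and velocity.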

dmac · July 9, 2023, 3:27pm

I use the HTM to predict one timestep into the future. It predicts the sensory input that it will see, given the current motor action.

The way this works is:
First I ask the simple controller what it wants to do, to keep the pole balanced.
Then I ask the HTM predictor what it thinks is going to happen as a result of the controller’s action.
Next I tell the controller “here is what’s going to happen as a result of what you just did” and then I give the controller a chance to correct any mistakes in its actions.
Finally, I combine both the original and the revised actions, send the result to the motor, and wait until the next timestep.
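The four steps above can be sketched in a few lines of Python. This is only an illustration, not dmac's code: `controller` and `predictor` are hypothetical callables (the predictor stands in for the HTM), and since the post does not say how the original and revised actions are combined, the sketch simply assumes they are summed.

```python
from typing import Callable, Sequence

State = Sequence[float]
Controller = Callable[[State], float]
Predictor = Callable[[State, float], State]


def control_step(state: State, controller: Controller, predictor: Predictor) -> float:
    """One timestep of the predict-then-correct loop (illustrative only)."""
    # 1. Ask the simple controller what it wants to do.
    original_action = controller(state)
    # 2. Ask the predictor what the next sensory input will be, given that action.
    predicted_state = predictor(state, original_action)
    # 3. Let the controller react to the predicted outcome.
    revised_action = controller(predicted_state)
    # 4. Combine both actions (assumption: plain sum) and return the motor command.
    return original_action + revised_action


# Toy stand-ins: a proportional controller on a 1-D state and a hand-written
# linear "predictor" in place of a learned HTM model.
def toy_controller(state: State) -> float:
    return -2.0 * state[0]


def toy_predictor(state: State, action: float) -> State:
    return [state[0] + 0.1 * action]


motor_command = control_step([1.0], toy_controller, toy_predictor)
```

In a real setup the predictor would also be trained online from the difference between its prediction and the next observation; that learning step is left out here.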

thanh-binh.to · July 9, 2023, 3:48pm

@cezar_t yes, something like PID. Back when I was a PhD student 30 years ago, after tuning we could get it to work for a limited number of episodes. Afterwards we tested with an ANN and got better results.

For HTM-based RL I used the idea from sunguralikaan (HTM Forum profile)
and only got a maximum of 49 episodes (best run).

thanh-binh.to · July 9, 2023, 3:51pm

Thanks for your feedback @dmac !
For how many episodes can you balance the cart-pole?

dmac · July 9, 2023, 3:59pm

The coefficients for the closed loop controller are hardcoded constants. Only the HTM-based predictor learns. The video shows the first three episodes, demonstrating zero-shot online learning.

cezar_t · July 12, 2023, 1:26pm

Sounds interesting. Did you also share this CartPole solution anywhere, e.g. on GitHub?

thanh-binh.to · July 12, 2023, 1:50pm

You can get his software in Python here

and adapt it for Cart-Pole.

cezar_t · July 12, 2023, 6:51pm

Thanks, that’s quite intimidating.

It seems it attempts to solve environments in “visual” mode by learning (and simulating) feedback via mouse interaction, which is not what simple gym tasks (like CartPole, MountainCar, LunarLander, etc.) are meant for.

That makes it a very complicated structure for anyone trying to approach simple RL and HTM problems via this route.

thanh-binh.to · July 13, 2023, 6:48am

cezar_t · July 14, 2023, 12:59pm

Exactly that, it is not very helpful. Most software/ML people did not graduate in neurology, so they do not know what those cryptic letter-digit codes mean.

dmac · July 28, 2023, 2:41am

The forward model hypothesis was discovered in the context of process control engineering by Otto Smith in 1957, and was applied to the cerebellum before I was born.

Overcoming process deadtime with a Smith Predictor

A controller equipped with an accurate process model can ignore deadtime. Deadtime generally occurs when material is transported from the actuator site to the sensor measurement location. Until the material reaches the sensor, the sensor cannot measure any changes effected by the actuator.

By Vance VanDoren, PhD, PE February 17, 2015

For the purposes of feedback control, deadtime is the delay between the application of a control effort and its first effect on the process variable. During that interval, the process does not respond to the controller’s activity at all, and any attempt to manipulate the process variable before the deadtime has elapsed inevitably fails.

[…]

Read More At: https://www.controleng.com/articles/overcoming-process-deadtime-with-a-smith-predictor/

Is the cerebellum a Smith Predictor?

Miall, Weir, Wolpert, Stein (1993)
https://doi.org/10.1080/00222895.1993.9942050
Free PDF: https://wolpertlab.neuroscience.columbia.edu/sites/default/files/content/papers/MiaWeiWol93.pdf

Abstract

The motor system may use internal predictive models of the motor apparatus to achieve better control than would be possible by negative feedback. Several theories have proposed that the cerebellum may form these predictive representations. In this article, we review these theories and try to unify them by reference to an engineering control model known as a Smith Predictor. We suggest that the cerebellum forms two types of internal model. One model is a forward predictive model of the motor apparatus (e.g., limb and muscle), providing a rapid prediction of the sensory consequences of each movement. The second model is of the time delays in the control loop (due to receptor and effector delays, axonal conductances, and cognitive processing delays). This model delays a copy of the rapid prediction so that it can be compared in temporal register with actual sensory feedback from the movement. The result of this comparison is used both to correct for errors in performance and as a training signal to learn the first model. We discuss evidence that the cerebellum could form both of these models and suggest that the cerebellum may hold at least two separate Smith Predictors. One, in the lateral cerebellum, would predict the movement outcome in visual, egocentric, or peripersonal coordinates. Another, in the intermediate cerebellum, would predict the consequences in motor coordinates. Generalization of the Smith Predictor theory is discussed in light of cerebellar involvement in nonmotor control systems, including autonomic functions and cognition.
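As a rough illustration of the Smith Predictor structure described in the two excerpts above, here is a minimal simulation sketch. It is not taken from either source. It assumes a first-order process with a pure transport delay, a PI controller, and a perfect internal model; the function name and all parameter values are made up for the example. The two internal models from the abstract show up as `model` (fast, delay-free prediction) and `model_delay` (a copy of that prediction held back by the loop delay so it can be compared with the real measurement).

```python
from collections import deque


def simulate_smith_predictor(steps=300, setpoint=1.0, a=0.9, b=0.1,
                             deadtime=10, kp=2.0, ki=0.2):
    """Simulate y[k+1] = a*y[k] + b*u[k-deadtime] under Smith-predictor PI control.

    Requires deadtime >= 1. Returns the list of measured outputs y[0..steps-1].
    """
    y = 0.0                                  # real (delayed) process output
    model = 0.0                              # delay-free forward model output
    u_delay = deque([0.0] * deadtime)        # transport delay of the real process
    model_delay = deque([0.0] * deadtime)    # internal model of that delay
    integral = 0.0
    history = []
    for _ in range(steps):
        history.append(y)
        # Delayed prediction, in temporal register with the measurement.
        delayed_prediction = model_delay[0]
        # Feedback = fast prediction + (measurement - delayed prediction).
        # The bracketed term is zero while the model is perfect and undisturbed.
        feedback = model + (y - delayed_prediction)
        error = setpoint - feedback
        integral += error
        u = kp * error + ki * integral       # PI control law

        # Real process only feels the control effort `deadtime` steps later.
        u_delay.append(u)
        y = a * y + b * u_delay.popleft()

        # Advance the delay model, then the delay-free forward model.
        model_delay.append(model)
        model_delay.popleft()
        model = a * model + b * u
    return history


outputs = simulate_smith_predictor()
```

With these assumed values the measured output stays at zero for the whole deadtime and then settles at the setpoint, because the controller is effectively tuning against the delay-free model. In the cerebellar reading of the abstract, the same mismatch term would also serve as the training signal for the forward model; no learning is implemented here.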

steve9773 · August 1, 2023, 2:36pm

Hey, good find!! I think that Miall, Weir, Wolpert, Stein paper is proposing essentially the same model as the one I like and described above. Or at least it’s related. I have just added it to my blog post as a reference. Thanks again.
