Hi folks, I wrote this blog post as I tried to understand dopamine in the brain—which involves digressions into the amygdala, mPFC, basal ganglia, Randall O’Reilly’s “PVLV model”, RL, nature vs nurture, the AI alignment problem, the embryonic organization of the telencephalon, and all sorts of other fun stuff. (Including a lot of ideas that I first heard about from y’all here on HTM Forum!)
It seems like whenever I post my stuff here I get extremely helpful feedback, so here’s hoping for more of that! Thanks in advance!
Hello Steve. As a recent arrival at this forum, I’m happy to see that Randall O’Reilly is on the radar. I’m interested in the amygdala - locus coeruleus - norepinephrine pathway as it relates to attention (and, er, David Chalmers’ “hard problem”.) Sometimes I like to oversimplify norepinephrine as the “on” switch and dopamine as the “off” switch. Haven’t had time yet to search for Numenta-based projects modeling the limbic system and basal ganglia.
To be clear, as far as I know, Numenta staff works exclusively on neocortex, thalamus, hippocampus, and entorhinal cortex. But other parts of the brain do come up semi-regularly here on the forum, and I’ve learned a lot browsing through those conversations. (…And several people here may read my blog post at the top of this thread and recognize a butchered version of their own ideas…)
Sorry for the delay in my response. This speculation is a work in progress with many gaps:
To have an artificially intelligent agent with emergent behaviors appropriate to all of its natural environmental circumstances, it would likely need dozens, hundreds, or thousands of cost functions, rather than the single cost function typical in ML. Those cost functions could in part be supplied by an analog to the amygdala, as routed through the locus coeruleus’s (LC) targeted release of norepinephrine.
I understand norepinephrine as having a major role in focusing attention, with the effect of priming neurons in regions that are relevant to the situation that aroused the amygdala in the first place. So those regions would tend to win neural competitions (Gerald Edelman’s neural Darwinism). The agent focuses on the problem at hand. In short, it could be the “on” switch.
For this to work, the norepinephrine delivery would have to be specific enough to target rather specific cortical areas (something I don’t know at this point).
On the flip side, dopamine release ultimately leads to a signal that says “you’ve arrived; you can stop that energy-intensive activity now,” i.e. the “off” switch. Since the ventral tegmental area (VTA) can release dopamine at intermediate stages during progress toward a goal, there have to be gaps in this story; the release of dopamine alone doesn’t appear to be sufficient to end the focus of attention.
The general idea is that models of these two neurotransmitters could serve the role of a cost function to direct an artificial organism to its goals, with feedback loops providing an analog to gradient descent (as in the neocortex, nucleus accumbens, … back to the LC and the amygdala).
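To make the above concrete, here is a toy sketch of the idea in Python. All of the names and numbers are my own inventions for illustration, not anyone’s actual model: an agent carries many cost functions instead of one, an NE-like “on” signal gates attention onto whichever cost is currently most aroused, and a DA-like “off” signal releases attention once that cost has been driven low enough.

```python
# Toy sketch (invented names/numbers): many cost functions, an NE-like
# attention gate, and a DA-like completion signal.

def ne_gate(costs):
    """Norepinephrine as the 'on' switch: attend to the subsystem
    whose cost function is currently most aroused."""
    return max(costs, key=costs.get)

def da_signal(cost_value, threshold=0.1):
    """Dopamine as the 'off' switch: fire once the attended cost
    has been driven low enough, releasing attention."""
    return cost_value < threshold

# The real proposal has dozens-to-thousands of cost functions; three here.
costs = {"hunger": 0.9, "threat": 0.2, "fatigue": 0.4}

focus = ne_gate(costs)            # amygdala/LC route attention -> "hunger"
while not da_signal(costs[focus]):
    costs[focus] *= 0.5           # stand-in for goal-directed behavior
print(focus, round(costs[focus], 3))  # -> hunger 0.056
```

Obviously the interesting part, left out here, is how the “goal-directed behavior” loop actually reduces the cost; the sketch only shows how the on/off switching could be wired around it.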
For me, some of the interesting questions are:
Could this be a key, possibly necessary, approach to the creation of a fully autonomous agent, such that it appears to be fully “alive” within its problem domain, with behaviors that are (almost) always appropriate to its environment (starting, say, at the level of complexity of a honey bee)?
If so, how simple could a neurotransmitter model get and still provide this behavior? Are two neurotransmitters enough, or do you need six? Would you have to model most of the regions seen in biological organisms, or could you get away with a dozen?
Is something like this currently part of Numenta’s architecture (my guess, probably “yes”)?
I dunno, imagine you’re playing a video game, and you just got an invincibility superpower and now you’re just tearing through all the enemies, racking up points, bam bam bam. I feel like this is a long stretch of both dopamine / reward, and excitement / arousal. I don’t think there’s a sense of “Yay I got my dopamine now I’m done and now I can relax”.
More generally, if there’s a signaling mechanism that means “hey cortical column, your last action was very good; you should repeat it in the future”, and there’s a signaling mechanism that means “hey cortical column, you need to be paying attention and roused for action right now”, … these don’t seem to me like they’re opposites. They seem orthogonal. Different situations might call for both signals, or neither, or one, or the other, seems to me.
FWIW I don’t know whether norepinephrine delivery is “specific enough to target rather specific cortical areas”, but I have seen evidence that ACh is pretty specifically targeted. (Example paper.) I’m planning to write a blog post about that in the near future; I think it’s important for my project. (Some good references on ACh specificity are in Marblestone, Wayne, and Kording, although as mentioned I don’t agree with the functional role they’re ascribing to ACh.)
I’ve been finding this all over the place lately. Multiple hormones and brain regions playing a somewhat similar role. Design by Darwin. Whatever happened to the DRY principle (don’t repeat yourself)? Or, in the words of I. I. Rabi, “who ordered that?”
I don’t rule this out, but for the record, I think that when we see that type of thing, it’s far more probable that we’re misunderstanding what’s going on.
I think of the book The Accidental Mind by David Linden. His favorite example which he kept going back to is how there’s a sensory-processing system in the brainstem (superior colliculus etc.) and also a sensory-processing system in the cortex (V1 etc.). “Ha ha, silly evolution, making two redundant systems”, says Linden, about 5000 times throughout the book.
…But in fact he was just (IMO) totally misunderstanding key aspects of brain design. The two sensory-processing systems are solving two totally different problems that call for radically different types of algorithms. In fact I think it would be absurd for there to be only one sensory-processing system in a human brain, given its suite of capabilities. And when people make AGIs someday, I absolutely expect them to put two separate sensory-processing modules in the code.
(Details if you’re curious: I think the brainstem one needs to take incoming sensory data and use it to answer a finite list of genetically-hardcoded questions like “Is there something here that looks like a spider? Is there something here that sounds like a human voice? Am I at imminent risk of falling from a great height? Etc. etc.” And it needs to do that from the moment of birth, using I guess something like hardcoded image classifiers etc. By contrast, the cortex one is a learning algorithm. It needs to take incoming sensory data and put it into an open-ended predictive model. Whatever patterns are in the data, it needs to memorize them, and then go look for patterns in the patterns in the patterns, etc. Like any freshly-initialized learning algorithm, this system is completely useless at birth, but gets more and more useful as it accumulates learned knowledge, and it’s critical for taking intelligent actions in novel environments.)
Hmm, my prior belief was that there should be a snake-detecting circuit in the superior colliculus, not the pulvinar. The paper at least doesn’t disprove that: it says “The medial pulvinar receives direct inputs from the retina (13, 18) and the deeper layers of the superior colliculus (15, 19)”, so maybe the SC detects the snakes and tells the pulvinar? It also says “Infant monkeys with bilateral neurotoxic lesions of the superior colliculus continue to reach for food in the presence of a snake model, whereas sham-operated monkeys avoid the food (21).” and “Studies of humans have implicated a pathway involving the superior colliculus, pulvinar, and amygdala in fast, automatic visual detection of fear-related stimuli, including snakes, at low spatial frequency (25–31).” (In my model, basically the SC would have the genetically-hardcoded snake template, the amygdala would gradually learn about snakes from the SC, and the pulvinar would be, I dunno, some kind of interconnect or something.)
My vague impression was that the pulvinar (or thalamus more generally) doesn’t really have the computational horsepower to detect snakes. Everything I read about the thalamus sounds like “a handful of simple calculations repeated many times”. By contrast, the superior colliculus has a bunch of layers connected in series that remind me (very vaguely) of a ConvNet. I could be wrong; I’m very much not an expert on either the SC or the pulvinar.
I think I once heard that parts of the frontal or prefrontal cortex connect to the dopamine-producing neurons themselves. I could be wrong, but if that’s the case, you could imagine the cortex being another source, computing and providing its own reinforcement signals as an alternative to those computed in earlier brain areas: a feedback loop capable of delivering potentially arbitrary reinforcement.
I’m thinking that this calculation happens right in VTA/SNc, with Reward coming from the hypothalamus (or maybe from a different part of the brainstem, I forget), and Reward Prediction coming from somewhere in agranular prefrontal cortex (or neighboring agranular insular cortex or agranular cingulate cortex) (I expect the signal comes to VTA/SNc both directly from agranular prefrontal cortex, and indirectly via ventral striatum or ventral pallidum).
To be clear, that’s my guess. I haven’t looked deeply into the literature on what’s connected to what and with what neurotransmitter etc. Please chime in if you know anything.
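In RL terms, the arithmetic I’m picturing in VTA/SNc is just the classic reward-prediction-error calculation. Here’s a minimal TD-style sketch; the variable names and the learning rate are mine for illustration, and this is textbook temporal-difference learning, not a claim about actual circuit details:

```python
# Sketch of dopamine as reward prediction error (RPE):
# VTA/SNc output = Reward (hypothalamus/brainstem, in this story)
# minus Reward Prediction (agranular prefrontal cortex, in this story).

def rpe(reward, prediction):
    """Dopamine burst if the outcome beats the prediction, dip if it falls short."""
    return reward - prediction

alpha = 0.3            # learning rate for updating the cortical prediction
prediction = 0.0
for _ in range(10):    # repeated trials with a constant reward of 1.0
    delta = rpe(1.0, prediction)
    prediction += alpha * delta   # prediction converges toward the reward

print(round(prediction, 3), round(rpe(1.0, prediction), 3))  # -> 0.972 0.028
```

Note how the dopamine signal shrinks toward zero as the prediction improves, which matches the standard story that fully-predicted rewards eventually stop producing dopamine bursts.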
I can’t currently imagine any good (evolutionarily adaptive) reason for a cortical region to directly calculate an RL reward for a different cortical region. Can you think of a story like that?
My thinking was that the reward signal is calculated by the globus pallidus (GP). The GP has two parts which inhibit each other, allowing it to subtract positive and negative rewards to find the net balance of the reward. Dendrites in the GP are unique: they are almost entirely covered with synapses, and a presynaptic axon can run alongside a dendrite and make many synaptic connections to that single dendrite. This would allow it to assign a “weight” to the connection by varying the number of synapses.
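Here’s a toy sketch of that opponent-subtraction idea, with synapse counts playing the role of integer weights. The input names and counts are made up purely for illustration:

```python
# Two opposing GP-like channels: each input is weighted by how many
# synapses its axon makes onto the dendrite, and the net reward is
# the difference between the two channels.

positive_synapses = {"food_cue": 5, "social_cue": 2}   # drives "positive" channel
negative_synapses = {"pain_cue": 4, "threat_cue": 3}   # drives "negative" channel

def channel_drive(synapse_counts, inputs):
    """Weight each active input by its synapse count onto the dendrite."""
    return sum(synapse_counts[name] * inputs.get(name, 0.0)
               for name in synapse_counts)

inputs = {"food_cue": 1.0, "pain_cue": 0.5}
net_reward = (channel_drive(positive_synapses, inputs)
              - channel_drive(negative_synapses, inputs))
print(net_reward)   # 5*1.0 - 4*0.5 = 3.0
```

The mutual inhibition between the two GP parts is what the final subtraction stands in for; varying a synapse count here is the analog of changing a connection weight.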
So that arbitrary societal goals can be inscribed into a child’s mind. The way I see it, the higher areas might have mental models of other social agents, and may infer or learn those agents’ goals and use them as the basis for reward signals.