Burst as a local learning rule in apical but not basal dendrites

Yes, if they form an internally self-reinforcing ensemble, then it’s one representation. For each constituent neuron, I think that means multiple SDRs / dendrites active at the same time. If that happens, the neuron should burst?

Well, ideally, if all of the inputs are part of “one representation” then they would form synapses close together on a single dendrite, to enhance detection of the stimulus.

Neurons burst-fire when they are activated by both basal inputs (regular, non-GNW inputs) and apical inputs (from the GNW).

My problem with that is that inputs from different areas are not likely to target the same dendrite. They may or may not turn out to be co-active later, but even if they do, how do you physically splice / transfer synapses between dendrites?

I think map-reduced competition is an interesting idea. To attend to part of the map, rather than selecting it in the big cortical sheet, it shrinks down to the size of the thalamus’s map, does competition, and then expands back up. It could re-add info lost in the shrinking using direct CC connections or preserve the info by spike patterns or something. That’d be a useful constraint, since it’d mean L5tt cells should be amenable to the loss of info or the means of compression.
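Here's a rough Python sketch of what I mean by shrink → compete → expand (the map sizes, average pooling, top-k gating, and patch-wise expansion are just illustrative placeholders, not claims about the actual anatomy):

```python
import numpy as np

def map_reduced_competition(cortical_map, thalamic_size, k=1):
    """Shrink a big cortical activity map down to thalamic size, run a
    winner-take-all competition there, then expand the result back up.
    Purely illustrative: pooling and patch-wise gating stand in for whatever
    convergence / divergence the real projections implement."""
    factor = len(cortical_map) // thalamic_size
    usable = cortical_map[:factor * thalamic_size]

    # "Shrink": average-pool the cortical map down to the thalamic map.
    shrunk = usable.reshape(thalamic_size, factor).mean(axis=1)

    # Competition in the small map: keep only the top-k winners.
    winners = np.argsort(shrunk)[-k:]
    competed = np.zeros_like(shrunk)
    competed[winners] = shrunk[winners]

    # "Expand": gate the original map with the winning thalamic patches, so the
    # detail lost in the shrink can be re-added (standing in for direct CC
    # connections or spike patterns carrying the fine structure).
    gate = np.repeat(competed > 0, factor)
    return usable * gate

cortical = np.random.rand(64)                      # toy "cortical sheet" activity
attended = map_reduced_competition(cortical, thalamic_size=8, k=1)
```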

That’d require more L5tt cells than TC cells. I couldn’t find cell counts for L5tt in a region and its targeted higher-order thalamic nucleus, not that I did an exhaustive search.

I don’t understand the exact connection between those ideas. I know motor patterns need something like global consistency, since there are so many cortical motor outputs (all regions and probably all cortical columns) and that could lead to contradictory motor actions. I don’t think map-reduced competition could do that, because it doesn’t operate on a global scale and wouldn’t work for the motor outputs from primary cortical regions. It’d also be using a lot of the same mechanisms to do things with sensory information (including in higher order thalamic nuclei, because some of those are driven by a mix of L5tt and the same sort of direct sensory input which primary nuclei receive).

Alien hand syndrome is another thing to consider. (For example, someone with it took a cigarette out of their mouth while trying to light it). It results from injuries to the brain, including CC connections (corpus callosum) and some cortical regions. That suggests the thalamus isn’t in charge of motor consistency (although it could be involved or even solely responsible, because damage studies aren’t very clear.)

Which connections? For signals to cortex (i.e. neocortex), my impression is that info-containing signals only go through thalamus. (Maybe olfaction bypasses thalamus, but I think O1 is allocortex and might still go through thalamus to reach neocortex). Modulatory signals (e.g. serotonin) can go straight to cortex, but something like striatum wouldn’t do that because it does something different.

To avoid a potential source of confusion, cortical afferent means cortex sends signals to striatum, right? I’m just not used to the terminology.

There’s attention to things with selected identities but also attention to parts of space (I think parts of reference frames e.g. locations in physical space or the reference frame impacted by hemineglect, as well as parts of the sensor e.g. part of the eye’s field of view). I think WTA in thalamus would be for attention to parts of space since TRN connectivity isn’t very precise. It probably wouldn’t be able to select SDRs, since those are arbitrary sets of cells.

Hebbian learning. Inputs will target all of the dendrites which they pass near to and whose activity they correlate with.

It’s not like synapses are going to physically up & move from one dendrite to another, but their presynaptic axons will likely pass near enough to multiple dendrites to form synapses in whichever spot seems most advantageous.
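As a toy sketch of that "whichever spot seems most advantageous" idea (the correlation measure, activity traces, and reachability set are made up for illustration):

```python
import numpy as np

def choose_dendrite(axon_activity, dendrite_activities, reachable):
    """Hebbian-flavoured placement: among the dendrites this axon passes near
    (reachable), pick the one whose activity correlates best with the axon's
    own activity. Returns the index of that dendrite."""
    best, best_corr = None, -np.inf
    for i in reachable:
        corr = np.corrcoef(axon_activity, dendrite_activities[i])[0, 1]
        if corr > best_corr:
            best, best_corr = i, corr
    return best

rng = np.random.default_rng(0)
axon = rng.random(100)                               # toy presynaptic activity trace
dendrites = rng.random((5, 100))                     # toy activity of 5 nearby dendrites
dendrites[3] = 0.7 * axon + 0.3 * rng.random(100)    # one dendrite is co-active with the axon
print(choose_dendrite(axon, dendrites, reachable=[1, 3, 4]))   # -> 3, the co-active dendrite
```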

Here is an article that discusses the phenomenon:

Electrical Compartmentalization in Neurons
Willem A.M. Wybo, Benjamin Torben-Nielsen, Thomas Nevian, Marc-Oliver Gewaltig
Cell Reports, 2019
DOI: https://doi.org/10.1016/j.celrep.2019.01.074

Figure 7 shows (their simulation of) synapses moving between dendrites.
IIRC they found that inhibition is critical for prompting synapses to move to different dendrites.


I’m assuming that synapses from TRN to thalamus are formed using Hebbian learning, in which case they are very precise with respect to the activity of the pre- and post-synaptic cells.

However, the word “precise” has many meanings. Maybe the TRN projections are not topologically precise?
[Edit: I think for many publications discussing the thalamus, topological precision is a big topic because so many of the cells in the thalamus are topologically precise, and it’s notable that the TRN isn’t.]

Yes, that’s what I meant. Now that I think about it, that doesn’t mean it’s competition on the level of maps, because it could just be that whichever fires first wins.

I meant competition among motor patterns, not consistency within each. If you have a fork in the road, you can’t take both. Maybe not directly for primary motor cortices, but those are controlled by higher areas anyway, which have something like higher-order motor patterns: strategies, etc.

Right. The signal is from higher association cortices: Striatum - Wikipedia
“The largest connection is from the cortex, in terms of cell axons. Many parts of the neocortex innervate the dorsal striatum. The cortical pyramidal neurons projecting to the striatum are located in layers II-VI, with the most dense projections come from layer V”

Thanks, yes, I recollect that extracellular calcium attracts those axons. So a neuron has many SDRs / dendrites, each representing a separate ensemble. My question was: suppose those ensembles are initially unrelated, but then turn out to be coactive under some condition. This will basically combine them into a higher-order ensemble. Does that mean some kind of WTA, redirecting coactive presynaptic axons to the strongest SDR, or can they stay where they are because that old SDR is still active, actually the most active if you adjust for the distance from previously associated axons?

Yes! There is a competition among the dendrites. They’re trying to activate, and activity attracts more & stronger synapses, which drives further activation.

And also there is inhibitory input to the dendrites, which is important for controlling this otherwise unchecked positive feedback loop.

Also, dendrites have limited space, which limits the total number of synapses they can have. (This fact prevents some runaway positive feedback loops.)

IIRC from the article “Electrical Compartmentalization in Neurons”: inhibition affects dendrites both by inhibiting them directly and by making it more difficult for nearby synapses to cooperate, which has the effect of “breaking” large dendritic segments into many smaller independent segments. Then those small dendrite segments can specialize in detecting very specific things. And when the inhibition goes away, all of those small specialized segments will cooperate again as one big general-purpose segment.
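A toy way to picture that compartmentalization effect (the segment boundaries, threshold, and input pattern are invented for illustration, not taken from the paper):

```python
import numpy as np

def dendrite_response(synaptic_input, inhibited, threshold=2.0):
    """Toy dendrite with 12 synapses. Without inhibition the whole dendrite
    sums its input as one big general-purpose segment; with inhibition it is
    'broken' into three small compartments that each must cross threshold on
    their own, so each can only detect a very specific local pattern."""
    if not inhibited:
        segments = [synaptic_input]                # one big segment
    else:
        segments = np.split(synaptic_input, 3)     # three small, independent segments
    return [seg.sum() >= threshold for seg in segments]

pattern = np.array([0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0], dtype=float)
print(dendrite_response(pattern, inhibited=False))  # [True] - broad, general detection
print(dendrite_response(pattern, inhibited=True))   # [False, True, False] - only the matching compartment responds
```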

I was thinking about consistency among motor patterns. I think it needs to pick multiple patterns, and they need to be mutually consistent.

This might not be relevant to the discussion, because it's not about whether the thalamus does winner-take-all.

Some motor patterns aren’t consistent with each other, whereas some are necessary together to do a task.
For example, don’t try to pick up a cup and scroll a mouse at the same time with the same hand, because that union of motor commands would have nonsensical results (useless flailing, torn muscles?). Whereas: reach for a cup with one hand and look at the part of the cup you’re reaching for, grabbing something with the other hand to balance if it’s a very heavy cup. Those are pretty basic movements, but you still plan them.

Multiple cortical regions plan things, in parallel. (Some planners are specifically for saccades, so a strictly linear command hierarchy doesn’t make sense, regardless of whether there’s a single top boss.)

The parallel plans need to be consistent to carry out the task if they all serve that task. Additionally, if they each have their own purpose, they shouldn’t interfere with each other.
(It’s possible to carry out separate motor tasks simultaneously, if you queue them up and don’t think about it too much while executing them. You still need to plan them to be consistent. For example, people (not me) can jump out of bounds to catch a frisbee and simultaneously look for someone to throw to.)

Competition in TRN could pick a single best plan, but it probably can’t pick multiple. They need to be consistent with each other, so it can’t just pick the top k winners. Maybe some other form of competition could pick them, but the consistency / cooperation must be learned and is highly context-dependent, so cortex is a better fit. Also, TRN is fairly nucleus-specific, so it can’t do much competition involving multiple cortical regions.
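To make the "can't just pick the top k winners" point concrete, here's a toy comparison (the plan names, scores, and compatibility table are all made up):

```python
import itertools

# Toy plan strengths and a learned compatibility table (both invented).
scores = {"reach_cup": 0.9, "scroll_mouse": 0.8, "look_at_cup": 0.7, "brace_other_hand": 0.6}
compatible = {("reach_cup", "look_at_cup"), ("reach_cup", "brace_other_hand"),
              ("look_at_cup", "brace_other_hand")}

def ok(a, b):
    return (a, b) in compatible or (b, a) in compatible

# Plain top-k competition: the highest scores win, consistency is ignored.
top2 = sorted(scores, key=scores.get, reverse=True)[:2]
print(top2)       # ['reach_cup', 'scroll_mouse'] - incompatible for the same hand

# Consistency-aware selection: the best pair that is actually compatible.
best_pair = max((p for p in itertools.combinations(scores, 2) if ok(*p)),
                key=lambda p: scores[p[0]] + scores[p[1]])
print(best_pair)  # ('reach_cup', 'look_at_cup')
```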

Map-reduced competition is still an interesting concept. I’d never really thought about the implications of convergence in thalamus. Maybe it’s involved in removing ambiguity from cortical representations (unions of SDRs) so it can use that like a sensory input to higher order regions. That’d mean L5tt cells can fire to represent ambiguity.

And I think that raises the question: what does it mean for a motor plan to be “consistent”? Likewise, how could an animal possibly know whether two actions can be done in parallel or whether they will interfere with each other? Of course we as humans can stop to reason about it and even run mental simulations, but what about the squirrel who is trying to hold a nut and climb a tree at the same time?

My answer:
I think the only way to know for sure whether a plan of action will work (and work as intended) is to try it out and see. The animal should receive a reinforcement learning signal at the end of every plan of action. Success or failure have rewards or penalties; otherwise, why would the animal have taken any action at all?

And this is why I think that the thalamus is using RL to control your attention, because what you pay attention to includes that “motor plan”.
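A minimal sketch of what that could look like computationally, assuming a simple bandit-style value update over candidate plans to attend to (the plan names, learning rate, and rewards are placeholders, not claims about thalamic circuitry):

```python
import random

# Value of attending to each candidate "motor plan" (names invented).
attention_value = {"climb_tree": 0.0, "hold_nut": 0.0, "freeze": 0.0}
alpha = 0.1   # learning rate

def attend_and_learn(reward_for):
    """Pick a plan to attend to (greedily, with a little exploration), carry it
    out, then use the reinforcement signal at the end of the plan to update how
    worthwhile attending to that plan was."""
    if random.random() < 0.1:
        plan = random.choice(list(attention_value))
    else:
        plan = max(attention_value, key=attention_value.get)
    reward = reward_for(plan)                                   # success / failure at the end of the plan
    attention_value[plan] += alpha * (reward - attention_value[plan])
    return plan

for _ in range(50):
    attend_and_learn(lambda plan: 1.0 if plan == "climb_tree" else -0.2)
print(max(attention_value, key=attention_value.get))            # typically "climb_tree"
```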

We are not talking about consistency of primary motor patterns; the thalamus is not directly connected to primary motor cortices, probably because its WTA is too slow to control them, even if map-reduced. Rather, it is consistency of “plans”, which for an animal means moving around. That consistency can to a large extent be reduced to proximity: you can’t be in two places at once. The simplest case is fight or flight: you have to pick the direction.
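As a toy illustration of reducing plan consistency to proximity (the plans and coordinates are invented):

```python
import math

# Toy plans, each tagged with the place it needs the animal to be (invented).
plans = {"fight": (0.0, 0.0), "flee_north": (0.0, 10.0), "flee_east": (10.0, 0.0)}

def consistent(plan_a, plan_b, max_distance=1.0):
    """Proximity test: two plans count as consistent only if the places they
    require the animal to be are close together - you can't be in two places
    at once."""
    return math.dist(plans[plan_a], plans[plan_b]) <= max_distance

print(consistent("fight", "flee_north"))   # False - you have to pick one
```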

I think thalamic attention is also for surprise or detection, because of burst / tonic mode and L6ct signals. Those are ubiquitous, so if thalamus controls attention with RL, it needs to contend with that kind of attention too.

I think L6ct cells keep track of things*. Perhaps that has something to do with planning, since plans are a persistent thing too. Maybe it does RL attention and surprise/detection attention with shared mechanisms. I mean, if they’re both in thalamus, it’d be strange to do both forms of attention in totally different ways.

*(Partially because they’re tentative grid cells in Numenta theory; I’m still figuring out the details / wording of this, something like points of interest. Also see “A novel somatosensory spatial navigation system outside the hippocampal formation”, which found some grid cells in vS1 and narrows down the possible layers they’re in. Another reason is that thalamic burst / tonic modes suggest L6ct signals convey what it already has detected or maybe predicted, so if that’s correct, L6ct cells only coding location wouldn’t make sense.)

The thalamus is synaptically connected to all regions of cortex, including M1. Most things go through it to reach cortex.

I think at the timescale of a few seconds, it’s planned. I can do lots of things simultaneously on that timescale, and there’s a lot you can do in that time so it needs to use sequences of more basic actions. 3000 ms is a lot for neurons.

If I plan what I’m going to do in the next few minutes, I’ll probably think about it, so it’s kind of general intelligence. I don’t think something basic like inhibition in thalamus is so directly linked to general intelligence. What you’re suggesting could be something a more basic purpose includes, though.

If you’re only talking about planning where to walk to, that seems too specific a function for thalamic inhibition.

It’s also where to look, which is sensory but location-specific. Basically, in any spatial reference frame that’s significantly larger than your body, choice consistency is closely related to proximity. My point was that you don’t need RL to figure out proximity.

I guess I missed that in your discussion with dmac. I don’t really understand but my point is probably pointless.

Sorry it wasn’t clear. I happen to think that most of the processing in cortex is purely cognitive (pattern recognition and generation); RL only modulates it on the margins.

Not to the same extent; it’s missing layer 4.

Is that for the thalamocortical loop? I was talking about internal WTA, which should be a global multi-step process; I don’t even know how many steps.

I agree, but that also goes for “consciousness”, however defined. Or RL for that matter. I think GI is best understood as a marketplace of predictive value, not a centralized system.

The thalamus targets (sometimes drives firing in) other layers, and M1 can have a subtle L4. Why do I even remember these things.

For ionotropic receptors, 100 ms is pushing things as far as I’ve seen, and 3 seconds is 30x that. Timescales like 3 seconds involve metabotropic receptors, network stuff, and/or maybe short term plasticity.

I think the problem with neuroscience is basically what you’re describing. Papers can try to figure out neural implementations of “attention” for example, but they can’t even acknowledge that “attention” is hypothetical once you fill in the slightest of details. The brain has no obligation to perfectly match all the implications of that word.

That’s why I think the real task is to figure out what to even design.

Well, the most general part is the neurons, minicolumns, and neuronal ensembles, especially in L2-L3. That alone may be most of GI, so Numenta is right to focus on it.
That’s if you want to stay neuromorphic, which I don’t.

I agree. I had to re-read Sherman’s 2017 review and think about this.

I think it suggests a more general definition of “attention”:

High-frequency bursts of APs are the way that neurons signal that they are transmitting information that is relevant to the animal.

  • Note that for an input stimulus to be “relevant to the animal” is a much higher bar than just “present in the input”.

  • There are many reasons why information might be “relevant to the animal”, and so there are many different mechanisms which control burst-firing.

  • When the thalamus detects a surprising input, it burst-fires because the unexpected input needs to be immediately processed and potentially acted upon. Surprising input is relevant to the animal because surprises can hurt! (See the sketch after this list.)

  • The cortex has its own suite of mechanisms for detecting and acting on burst firing: in the apical dendrites and in inhibitory Martinotti cells.
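A toy sketch of the surprise case above (the “predicted” flag stands in for L6ct corticothalamic feedback holding the cell in tonic mode, and the firing rates are invented):

```python
def relay_cell_output(driver_input, predicted, burst_rate=300.0, tonic_rate=40.0):
    """Toy thalamic relay cell. A predicted driving input is relayed at an
    ordinary tonic rate; an unpredicted (surprising) input arrives while the
    cell is in burst mode, so it answers with a high-frequency burst that flags
    the input as relevant and in need of immediate processing."""
    if not driver_input:
        return 0.0
    return tonic_rate if predicted else burst_rate

print(relay_cell_output(driver_input=True, predicted=True))    # 40.0  - expected input, tonic relay
print(relay_cell_output(driver_input=True, predicted=False))   # 300.0 - surprise, burst
```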

[EDITED: in response to @Casey’s comments. I removed the word “attention” as it’s rather vague.]

I didn’t mean the task is to figure out what concrete things to design, e.g. minicolumns. I meant what functionality to create. For example, it’s obvious that the brain recognizes the image on the retina at any given instant. But TBT suggests the brain does something else.

Assumptions like that can cripple progress, and that has happened multiple times. This isn’t just a problem for brain-based AI, I think, because of AI winters. We can’t take vague concepts like “attention” for granted. They’re just English words or phrases. Other languages conceptualize things differently.

A research field is a collective superintelligence, so it’s human-level in some respects, including broad assumptions. It should use its superintelligence to address its weaknesses, including assumptions. Human-level intelligence is nowhere near enough. Einstein took hundreds of millions of seconds to create his ideas. We’re all kinda dumb from moment to moment.

We shouldn’t try to be smart. We should try to be very good at doing dumb thoughts right.
