Fractal brain: neural ensemble tree replicated in dendritic tree?

I asked for help with that question from James Johnson, who emailed this back to get us started:

I’m traveling, but will be responsive on or after June 25. I’d be interested in trying my hand at introducing the research to a new audience and understanding the perceived applications.

I’d direct your group to these texts for some lighter reading.

Dante R. Chialvo. Emergent complex neural dynamics. Nature Physics, 2010.

https://www.quantamagazine.org/toward-a-theory-of-self-organized-criticality-in-the-brain-20140403/

https://www.quantamagazine.org/brains-may-teeter-near-their-tipping-point-20180614/

https://www.quantamagazine.org/do-brains-operate-at-a-tipping-point-new-clues-and-complications-20190610/

And here is a comprehensive text:

https://onlinelibrary.wiley.com/doi/book/10.1002/9783527651009

Sent from an Android device. Please excuse the typos.

Best regards,
James K Johnson

One of the articles includes the word “clues”, and I’m still confident this is much like my past experience, where there was a wave-propagation-related unknown. Information from real neurons showing a peculiar tipping point like this is something theoretical models must account for, and it may also help figure out how a still largely mysterious mechanism works.


I think it’s two separate questions: exact dynamics vs. architectural mapping of cross-scale replication. You are exploring the first and I am mostly interested in the second: do the neurons represent ensembles only in cortically lower areas: V1 vs. V2, etc.? The dynamics of it is interesting too, but I suspect it would be different on ensemble vs. dendritic tree levels.

Bifurcation Theory
" Most commonly applied to the mathematical study of dynamical systems, a bifurcation occurs when a small smooth change made to the parameter values (the bifurcation parameters) of a system causes a sudden ‘qualitative’ or topological change in its behavior.[1]"

“In mathematics, particularly in dynamical systems, a bifurcation diagram shows the values visited or approached asymptotically (fixed points, periodic orbits, or chaotic attractors) of a system as a function of a bifurcation parameter in the system. It is usual to represent stable values with a solid line and unstable values with a dotted line, although often the unstable points are omitted. Bifurcation diagrams enable the visualization of bifurcation theory.”
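The two definitions above can be made concrete with a few lines of code. This is a numeric sketch of what a bifurcation diagram plots, using the logistic map (a standard textbook example, not tied to the neural model discussed here): for each value of the bifurcation parameter r we discard the transient and collect the values the orbit keeps visiting, i.e. the attractor.

```python
# Numeric sketch of a bifurcation diagram's contents, using the
# logistic map x_{n+1} = r*x*(1-x): for each bifurcation parameter r,
# let the transient die out and record the distinct values the orbit
# keeps visiting (the attractor a diagram would plot against r).

def attractor(r, x0=0.5, transient=1000, keep=64, tol=1e-6):
    """Distinct values the logistic map settles onto for parameter r."""
    x = x0
    for _ in range(transient):          # discard transient behaviour
        x = r * x * (1.0 - x)
    values = []
    for _ in range(keep):               # sample the attractor
        x = r * x * (1.0 - x)
        if not any(abs(x - v) < tol for v in values):
            values.append(x)
    return sorted(values)

# Below r = 3 there is a single stable fixed point (at 1 - 1/r); just
# past r = 3 it bifurcates into a period-2 orbit: a sudden qualitative
# change caused by a small smooth change in the parameter.
print(len(attractor(2.5)))   # 1
print(len(attractor(3.2)))   # 2
```

A full diagram is just this computation swept over many r values and plotted.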

First author here! Feel free to ask direct questions. I’m pleased to see connections we had not made (the searchlight hypothesis in particular). As a physicist I can see epistemological differences that can affect the proper placement of the work in a set of possible applications; I’ll gently address those at the end.

It looks like folks here really care about two main things: what this implies about neuronal integration of information (à la CapsNet), and how to relate it to the physical architecture of the brain. Personally, I also care about exploiting what we found in real neural systems.

Neuronal Integration of Information:
There is a rather “drastic” limitation on the way neurons integrate information: they cannot be logic gates. I mean this in the traditional sense, where two populations input to the neuron and the neuron fires if one or the other population is active but not both, etc. It’s easy to see why: if both populations are active, then an MEA might show a very big “neuronal avalanche”, but if the neuron is computing an XOR gate it would show a very small “synaptic avalanche”. So small synaptic avalanches would be over-represented and large synaptic avalanches under-represented compared to their neuronal-avalanche counterparts. The same is true for an AND gate, but not an OR gate. We mostly cut this from our discussion. The effect is limited, so a neuron could be weakly XOR-like or weakly AND-like and cross-scale inference would still be possible.

However, we think there is a way out of this. Our model enforced the condition that each neuron’s instantaneous firing rate was equal to the population firing rate of its input population. This is the same as saying that each neuron computes the average or the sum of the firing states of neurons within its presynaptic population (OR gates only). If a neuron is OR-like when the level of synaptic bombardment is low, but becomes more “focused” on certain populations of inputs as bombardment and membrane potential increase, then it could potentially stay true to “cross-scale inference” while effectively spiking as an AND gate. This would defy our model’s conditions for criticality (but that’s not worrisome, see below).
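A toy numeric sketch of that averaging condition (the pool size, rate, and step count are invented for illustration, this is not the paper's model): a neuron whose instantaneous spiking probability equals the fraction of its presynaptic pool that just fired will track the population rate over time, making the single neuron a noisy sensor of the ensemble.

```python
import random

random.seed(1)

# Sketch of the averaging condition: the neuron's instantaneous firing
# probability equals the fraction of its presynaptic pool that fired
# on the current step (an OR-like / averaging neural action).

def simulate(pool_size=100, steps=10000, pool_rate=0.15):
    neuron_spikes = 0
    fraction_sum = 0.0
    for _ in range(steps):
        # presynaptic pool: independent Bernoulli spiking
        active = sum(random.random() < pool_rate for _ in range(pool_size))
        fraction = active / pool_size
        fraction_sum += fraction
        # averaging neural action: P(spike) = active fraction of pool
        if random.random() < fraction:
            neuron_spikes += 1
    return neuron_spikes / steps, fraction_sum / steps

neuron_rate, population_rate = simulate()
# The neuron's long-run rate tracks the population rate, so its spike
# train acts as a noisy sensor of the larger ensemble's activity.
print(round(neuron_rate, 2), round(population_rate, 2))
```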

I also saw, in the title and a couple other times, the question of whether single neurons represent ensembles. This is highly relevant to work we are proceeding with now. If our model’s conditions for criticality hold (each neuron’s instantaneous firing rate equals the fraction of neurons which fired in its presynaptic pool), then the question becomes whether single neurons always represent the same ensemble. If, like the searchlight hypothesis, the inputs can be “focused” (some become active, some become silenced) the neuron may not represent an ensemble, but may still allow cross-scale inference.

Exploiting Cross-Scale Inference
We debated publishing in J Neurosci because there is a lot of methods detail that is worth including in a supplement but not in the main paper, which is already quite long (J Neurosci doesn’t allow supplemental material). This will make reproduction more challenging, but we do expect to share the code eventually. We won’t share it until it is refactored according to publishing standards, fully commented, and unit tested on other data.

This method (using the statistics of signal geometry) would probably only work for long time averages, and it requires careful normalization. In other words, it needs a lot of data and very careful conditions. You can find on my YouTube profile another video explaining preliminary results on efforts to get a more instantaneous view. We greatly improved the results by completing the hyperparameter optimization effort that was unfinished at the time of that video, and we are just about ready to submit.

Relating the work to the physical architecture of the brain
This is basically an open frontier, and it requires some knowledge about this “criticality stuff”. Basically, everything we physicists think we know about “self-organized criticality” comes from computer models; it has never been definitively observed in the wild. In those models, the signal geometry relates very strongly to the physical geometry of spreading activity. Because synaptic connections can skip around neighboring sites, it is incredibly hard to observe this in neural data.

I can link to articles about it, but the takeaway is these three facts: 1) There must be a relationship to connectomics but we don’t know it. 2) The fractalness of the brain’s physical architecture is probably coincidence (scale-free dynamics do not require scale-free networks). 3) In most models “criticality” is enforced by setting precise physical parameters (e.g. connection strengths/probabilities), not dynamical parameters. We can only detect “signatures” for criticality by looking at dynamics but these can be buried if exogenous drive overwhelms endogenous dynamics.
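To make the “signatures of criticality in the dynamics” point concrete, here is a toy branching-process sketch (not the paper's model; the offspring rule and all parameters are invented for illustration): tune a physical parameter, the branching ratio, to 1 and avalanche sizes become heavy-tailed; tune it below 1 and avalanches die out quickly, which is the kind of dynamical signature one hunts for in data.

```python
import random

random.seed(42)

# Toy branching process: each active unit independently produces 2
# descendants with probability sigma/2 and 0 otherwise, so the mean
# offspring count (the branching ratio) is sigma. Criticality is set
# by this physical parameter, and detected in the dynamics.

def avalanche_size(sigma, cap=10000):
    """Total number of activations before the cascade dies (or hits cap)."""
    active, size = 1, 0
    while active and size < cap:
        size += active
        nxt = 0
        for _ in range(active):
            if random.random() < sigma / 2:
                nxt += 2
        active = nxt
    return size

sizes_crit = [avalanche_size(1.0) for _ in range(1000)]  # critical
sizes_sub = [avalanche_size(0.7) for _ in range(1000)]   # subcritical
# The critical ensemble contains rare, very large avalanches
# (a heavy tail); the subcritical ensemble does not.
print(max(sizes_crit), max(sizes_sub))
```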

Epistemologies
The best purely academic reading, aimed at explaining this physics stuff to a non-physicist, is this article: Emergent Complex Neural Dynamics

mechanisms
The idea of self-organized criticality (SOC), if true, is an explicit alternative to the concept of a “mechanism”. This is because of the better-known idea of “emergence”. If a person is studying emergent phenomena in an SOC system they might find a mechanism, say for a response to a kind of stimulus. They might demonstrate sufficiency, and everything seems to check out. Then somebody comes along and lesions a part of their mechanism… The response is still there; maybe it’s different, but it’s pretty much the same. Our very clever person finds a new mechanism! Somebody lesions it too, and maybe it requires lengthy recovery or even rehab, but the animal can still be shown to have the response. This goes on ad nauseam until the researcher decides responses have very many “sufficient” mechanisms but rarely have “sufficient and necessary” mechanisms. The relevant fact about SOC systems is that they are “metastable”: they are always in flux. This means they are always “exploring” slightly different modes of operation. If there is an external optimizer (e.g. a reward circuit or protocol), then it is not limited to any one solution/mechanism.

This is very much a physics kind of idea. Hydrogen emerged out of the stew of particles after the Big Bang because there is a “reward circuit” in the form of energy minimization. Naturally, the number of mechanisms governing the bonding of particles is smaller than the number of paths through a network, so the analogy is imperfect. The general idea is called an “extremal principle” (it has a Wikipedia page). Extremal comes from “extremum”, another word for the minimum or maximum of a function, or the optimum of an optimization problem. Extremal principles only apply to systems in flux. Being SOC is one way to be in flux. Simply apply an optimization constraint and the system will find the mechanism which is optimal given all the constraints on the system, including the one you just added. A physicist thinking this way about implications of SOC would not really care about naming the mechanism behind the implication; we just assume there is always at least one, if not infinitely many.

critical branching
If you read the newest Quanta Magazine article I shared, you’ll see that we don’t really know what the brain looks like when it is “subcritical” or “supercritical”. However, for critical branching networks, we do know, and the “off-critical” brain does not look much like the “off-critical” branching network. This is no problem.

I’d ask you to read the Wikipedia page on “universality classes”. AT CRITICALITY the details of your model are irrelevant to the geometrical statistics we gathered; the further a system is from criticality, the exponentially more relevant the details become. We use critical branching because it is easy and somewhat plausible. If we add inhibitory sites to the branching network we get the oscillations and synchrony discussed in the Quanta article (though the researchers interviewed might not be aware of this).

This too is a pretty weird way for a non-physicist to look at things, but it has a pretty big plus side: It reduces the number of complicated ideas one has to understand. For questions limited to signatures of criticality one probably has infinitely many models to choose from, so you don’t have to care about details, but do have to keep an open mind about off-critical things. For example, we showed that cross-scale inference is best at criticality in our model. For the real brain, the critical point behavior is the only thing that is definitively transferable, so all we really know is that at criticality cross-scale inference is pretty good. I doubt it’s better away from criticality because the correlation length and time are shorter away from criticality, and our model is vaguely similar, but this is merely common sense, not proof.

There are also two fun conjectures stemming from universality that I like to think about, and they give a sense of how brain scientists can use it. 1) Different brain systems could be critical in different ways. Maybe V1 is critical with the branching/averaging action that we describe, but M1 is critical some other way. 2) Criticality may operate like a clutch or differential in an automobile. Non-critical phases are strongly stereotyped; they are “attractors” and will “pull” the system into strongly patterned behaviors. If two different brain systems do have very different architectures, then being at the critical point (where no pattern can dominate) may facilitate communication.

Summary
The fact that synaptic avalanches and neuronal avalanches scale the same way limits the ways that single neurons can act like logic gates. This suggests feature-selection-type integration.

Importantly, we have shown that single neurons can act like sensors of larger network activity, and that subthreshold fluctuations of membrane potential have fine structure carrying information that has long been neglected. Soon we’ll put a paper on arXiv or bioRxiv that shows how to get stimulus information out of the fluctuations.

The relationship to brain architecture and fundamental mechanisms is both an open frontier and a red herring. Criticality begets universality and emergence; these are mechanism- and architecture-agnostic properties that would be very useful for any stochastic self-assembling and self-regulating information processing system to have. While universality and emergence are agnostic to the particular arrangement of matter, criticality itself requires a specific arrangement, and that is usually very fragile.

These concepts require abandoning the goal of nailing down all the details of a system, or the precise sequence of an action. We are approaching the brain as a non-equilibrium statistical mechanics problem. In such systems one can get very precise and reliable behaviors and actions via odd kludges pulled from an infinitely large grab bag of mechanisms and sequences. It may be different from brain to brain, or day to day. This is both liberating and limiting. Some functions in the brain are obviously quite specific and do have both necessary and sufficient mechanisms (e.g. the amphibian looming reflex), so this simply isn’t applicable to all things.


Thanks for joining and for the explanation!

Do you mean separate branches of the dendritic tree representing different ensembles, or can different synapses of the same structure do that?

I am a bit curious why you think that logic gates are an important part of brain function?
If the brain’s purpose is to be a memory unit, all that is necessary is a faithful recognition/reproduction of presented patterns (and a learning method).

The predictive brain model that is at the heart of HTM is essentially a serial memory unit, where an input pattern recalls some matching patterns and, as the perceived sequence progresses, reduces the potential pool of patterns down to some minimum and generates the matching future projections that go with the remembered pattern.
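That serial-memory reading can be caricatured in a few lines (a toy illustration only, nothing like Numenta's actual implementation; the stored sequences are invented): an observed prefix recalls the matching stored sequences, each further input narrows the pool, and the survivors supply the projected continuations.

```python
# Toy caricature of a serial memory unit: prefix recall plus
# progressive narrowing of the candidate pool.

memory = [
    ("A", "B", "C", "D"),
    ("A", "B", "X", "Y"),
    ("Q", "B", "C", "Z"),
]

def predictions(observed):
    """Next elements of stored sequences whose prefix matches `observed`."""
    n = len(observed)
    matches = [s for s in memory if s[:n] == tuple(observed)]
    return {s[n] for s in matches if n < len(s)}

print(sorted(predictions(["A"])))            # ['B']
print(sorted(predictions(["A", "B"])))       # ['C', 'X']  (pool unresolved)
print(sorted(predictions(["A", "B", "C"]))) # ['D']       (pool reduced to one)
```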


I don’t regard logic gates as an important part of brain function, just a heretofore possible part of brain function and a well-known proposal for the action of dendritic computation. I use the term logic gate as a shorthand for a class of neural actions, irrespective of any semiotics. The HTM proposal does not exclude logic gates, though a purist might resist calling them logic gates in the HTM context. In any case it’s the interpretation that differs; it’s the same neural action: if two populations (A & B) being active at the same time results in synaptic bombardment (measured at the soma) that is less than the bombardment caused by just A or just B being active, then the neuron is computing an XOR between those two populations (maybe I shouldn’t call it an XOR, but I don’t have more precise language).

If A & B make up 30% or more of the total presynaptic population, then we believe this causes a mismatch between MEA and Vm results that we could have detected in our data. This means that if an XOR- or AND-like operation were even a moderate component of the transformation of synaptic activity into the fluctuation measured at the soma, then we would have detected more error than we found. I can’t be sure about the 30% number; it’s a rough recollection (we cut it from the paper). But it illustrates the point: there is some limit to the divisive or multiplicative operations that dendritic architectures can perform on the firing rates of presynaptic populations; otherwise, our results would not be possible.
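A toy numeric illustration of that mismatch argument (the numbers and the gate functions are invented, this is not the paper's analysis): when populations A and B fire together, the population-level event is large, but an XOR-like neural action delivers only a small somatic drive, so small "synaptic avalanches" would be over-represented relative to their "neuronal avalanche" counterparts.

```python
# Toy comparison of an averaging (OR-like) neural action with an
# XOR-like one, measured as drive arriving at the soma.

def or_like(a, b):
    # averaging/summing action: somatic drive tracks total input
    return a + b

def xor_like(a, b):
    # large drive for one population alone, collapsing when both fire
    return a + b if (a == 0 or b == 0) else abs(a - b)

# (A active, B active) for three events: A alone, B alone, A and B
events = [(10, 0), (0, 12), (10, 12)]

for a, b in events:
    print(a + b, or_like(a, b), xor_like(a, b))
# For the co-activation event the population event has size 22, but
# the XOR-like somatic drive is only 2: a cross-scale mismatch that
# would distort the avalanche-size statistics.
```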


Neither option fits well. There may be implied meaning that I miss, but it sounds like either a different view of neural action needs to be entertained, or it was missed that we only record from the soma and don’t discuss dendritic architecture. “Neural action” is defined by us as the input-output relationship between the spiking states of the presynaptic population and the spiking state of the neuron we are interested in (it’s very general). So the “focus” could arise from: 1) interactions among the presynaptic population, 2) the cumulative effect of the entire dendritic arborization, 3) an interaction between a couple of branches in a downstream trunk, or 4) specific mechanisms within single dendrites. It’s all the same to us; dendritic architectures are black boxes in our model and to (all?) statistical physics perspectives of brain function.

When the neural action computes an average, there is no reason the population it is averaging over couldn’t be two or more synapses upstream. In other words, the immediate presynaptic population and the totality of dendritic architectures can “compute” things about far-removed populations. So the mechanism of “focus” could literally be imposed by the thalamus on the inputting population or local circuits (therefore changing in time), or it could have some other origin. The point is that our conditions for criticality and cross-scale inference (computing an average) can accommodate neural actions that are a lot more complicated than our simple model (neurons do represent ensembles in our model).

Neural action is relatively well understood, so I suspect it’s the latter. Sorry, I am interested in the architecture; it seems too complex to serve no function. So, one possibility I see is that LTP/LTD may work within dendritic branches, without necessarily causing an axonal spike?

There are probably infinitely many configurations of dendritic architecture and mechanisms of synaptic interaction which can result in the class of neural action we specify. That’s why we don’t discuss it. It could be LTP/LTD within dendritic branches or something else entirely. One example is tight balance, which can result in somatic Vm fluctuations that track excitation among the population two synapses upstream. That’s just an example of something that would be sufficient for our purposes. No matter how complex the neural action is, so long as the instantaneous firing rate of the neuron at time t approximates that of any proximal neural population at time t−Δt, our simplistic conditions for criticality and cross-scale inference may hold. We showed that it holds if Δt is fixed and the population is fixed; we are now examining variable Δt and a “focus” on an upstream population that changes in time.

I’d love to see a larger set of possible mechanisms, but I don’t have that list! For now, we just assume there is at least one, if not infinitely many. Otherwise, we would not have gotten these results. If it interests you please share, but understand I can’t narrow it further.

Another odd fact is that in our searches we found no literature about dendro-dendritic synapses in V1. They are prolific in the thalamus, and have been found in motor cortex, but either nobody has looked for them in lower visual cortex or no one is publishing negative results. There could be some really interesting interactions if they are extant in lower visual cortex.

The problem I have with the computational dendrite is fairly simple: how does the haystack mess that is the dendritic arbor end up being configured into a useful circuit?
I can see how the fire-together, wire-together thing turns into a memory particle, but I can’t even start to see how useful logic elements and chains of such elements would form.

So many things have to end up connected correctly to do anything useful:

Well, for AND | OR logic you need inhibition, so one of the inputs must pass through an interneuron, probably in the same minicolumn?

I think you missed the word useful in that sentence.

So you have a tree with a bunch of essentially randomly positioned gates. I have spent many years working with digital logic, and making the connections so the gates do something that makes sense is a challenge; VLSI gate arrays do not wire themselves.

Before you offer this as any sort of plausible mechanism, I need to see the companion explanation of how these logic elements would be self-configured in vivo.

I recollect that V1 is one area that is not connected to the basal ganglia. So, an off-hand guess is that dendro-dendritic synapses may depend on dopamine?

Actually, hasn’t Genetic Programming been able to beat human VLSI designers in many cases? I’m no silicon guru, but I’d heard (from the guy who holds the patent on GP, mind you) that it was quite popular in the VLSI world now.

Things like fire-separately, wire-together, for neuron-interneuron synapses?
Recent discussion comes to mind:

3 months later? Sorry… I’ve only just read your reply today… or some synapse of mine switched just now. And I hope you’re still around.

I’m pretty sure I’m guilty of posting too early for my current understanding, but it rings too many bells for me not to take notice now.

I’m far from a neurobiologist, and started to read about the field only recently, but I’ve been trying to understand the biological “details” ever since I got a sufficient grasp on the HTM model. Now, these days… I feel overwhelmed by the neuro papers & data.
So, for me, it is more of an increasingly pressing dread of those “complicated ideas” and details I once cherished, which has triggered my interest in stepping back a little and trying to tackle the problem from the side of… how do you say it… an “external optimizer”.

I won’t be able to follow algebra-heavy discussions about criticality, or even the way it rules out an implementation using particular gates. But I intuitively sense that, say, “counting synapses and APs so that a stabilizing phenomenon attracts each cell to having roughly the same number of inputs as outputs” is going towards what I’m looking for.
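For what it's worth, that "stabilizing phenomenon" intuition resembles homeostatic synaptic scaling, and it can be sketched without any algebra (a toy model; the set point, gains, and threshold are all invented for illustration): the cell slowly estimates its own firing rate and multiplicatively scales its input gain so that the rate is pulled toward a set point, never underused, never overused.

```python
import random

random.seed(0)

# Toy homeostatic scaling loop: a slow rate sensor plus a
# multiplicative gain adjustment that steers the firing rate
# toward a fixed set point.

TARGET_RATE = 0.1
STEPS = 40000

gain = 1.0            # multiplicative synaptic scaling factor
rate_estimate = 0.0   # slow running estimate of own firing rate
spikes_late = 0       # spikes counted after the transient

for step in range(STEPS):
    drive = random.uniform(0.0, 0.4) * gain         # fluctuating input
    spiked = drive > 0.2                            # fixed threshold
    rate_estimate += 0.001 * (spiked - rate_estimate)
    gain += 0.001 * (TARGET_RATE - rate_estimate)   # homeostatic scaling
    gain = max(gain, 0.0)
    if step >= STEPS // 2:
        spikes_late += spiked

late_rate = spikes_late / (STEPS // 2)
# After the transient, the cell fires near the set point even though
# nothing fixed its input statistics in advance.
print(round(late_rate, 2))
```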

If I may… here are a few ramblings of mine, from recent PM discussions:

any clue for how a mechanism rewarding “pertinent” information would work from map to map? I believe deciding to detect “edges” in V1 is not inferred from a pure statistical property of real-world visual input, but from this combined with the higher-up information that “yeah, it helped make sense of something. Please go on.”

I can’t see the online learning networks of us vertebrates operating without some mechanism rewarding not just statistical info, but also “pertinent” info. It can take the form of every cell wanting to be neither underused nor overused.

  • they do, and we need more wiring info than most papers care to provide.
  • they don’t, but dynamics and pertinence-craving mechanisms are enough for synapses to functionally distinguish between them.

My further hope would be to understand how such hardwired basic functionalities (as in insects) are converted into an implementation using self-reinforced mechanisms (learning). My current bet is that there really is a “craving for pertinent info” mechanism, identifiable at the level of single cells, going beyond STDP. Like… STDP plus a “yeah, thanks” from the postsynaptic cell influencing whether we ought to continue discriminating on things similar to the current input, or whether we should try sensing something else entirely… And my hope then is that most of the tale would lie here: wire that up anyhow and it should kinda work.

I’ll be focused on deciphering “the drive to populate those maps”. Maybe that’s all neurobiologists have ever done, and that phrasing is dumb. But, I dunno, I feel there’s something to be found here from a viewpoint I was oblivious to before. And my gut tells me it’s somewhere close to the origins.

Now, if you see SOC fitting precisely that role, or even if what I’m (poorly) trying to talk about is ringing any bell, please let me know :slight_smile:

regards,
Guillaume

Not @Jojker, but here is my take:

That exploration and evaluation is a mechanism, just a higher-order one. Nothing happens without a mechanism.

And once all maps are populated, by the end of adolescence at the latest, does the brain stop learning?
There is always displacement of lower-predictive-value old memories. “Drive”, “want”: these anthropomorphic terms are misleading when we are talking about simple low-level mechanisms.

All that neurons are doing is recognising patterns: coincident inputs. Strong patterns propagate upstream and laterally (perhaps in @Bitking’s hexagons), while also suppressing downstream: predictable / low-additive-value future inputs. Weak / suppressed nodes invite replacement inputs from neighbouring downstream areas. I think your basic “pertinence” is downstream reinforcement of novelty vs. predictability, overlaid on upstream-detected input pattern strength. With similar lateral mechanisms of inhibition among same-source (expected coincidence) co-activated nodes, and reverberation among different-source (unexpected coincidence) co-activated nodes.
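One way to caricature that novelty-vs-predictability idea is a predictive-coding-style residual (a toy sketch with invented scalar signals and learning rate, not a claim about the actual circuitry): a node forwards only the part of its input that its running prediction fails to account for, so predictable input is suppressed and novel input propagates.

```python
# Toy sketch: forward the prediction error (novelty), suppress
# the predicted (expected) component of the input.

def residual_stream(inputs, lr=0.5):
    """Return the prediction error emitted for each input in turn."""
    prediction = 0.0
    out = []
    for x in inputs:
        error = x - prediction      # the unpredicted (novel) component
        out.append(error)
        prediction += lr * error    # the downstream model catches up
    return out

steady = [1.0] * 8                  # a perfectly repeatable input
errors = residual_stream(steady)
# The first presentation is fully novel; repeats are progressively
# suppressed as they become predictable.
print(errors[0], round(errors[-1], 3))
```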


Not arguing that, of course.

I believe some maps are pretty stable at some point (e.g. V1), and there should be something encouraging lower levels to stabilize somewhat before higher ones, imho, but having them “populated” wasn’t meant to represent the end of learning. Sorry for those sentences without context. This was a mix of several concerns, and we were talking (with @Bitking) about evolution and “adding more maps” to the cortical graph.

This is beginning to address my current question. But I don’t know yet what to make of it.

“All that neurons are doing” can be taken somewhat dismissively, like: okay, let’s move on and seek elsewhere an explanation for our “intelligence” musings. Or it can be seen as something more fundamental, like: such explanations also lie here (and we have not drawn that conclusion yet precisely because we’re too proud to place such a “simple low-level” mechanism in this position).

It is becoming clear to me that coincidence detection is great, but it ain’t great alone. We need something to filter out both the obvious coincidences and the fortuitous ones. STDP provides an implementation for coincidence detection; OK, got that covered. Now I’m looking for the missing part of the story: pertinence.

  • SOC seems to address that by a global mechanism enforcing criticality. And maybe that’s all there is to it.
  • You seem to address that by a mechanism of downstream inhibition. And maybe it fits that role entirely.

It seems you’re more advanced in your understanding of that downstream inhibition loop than I am. Had I studied further, I could decide whether that loop is indeed already well understood and we need to move on, or whether there’s more work to be done to understand how it comes to be.

And then, as for the “all that neurons are doing” part, be clever enough to discern such a mechanism as fundamental (and “sufficient”, in James’s sense), or not.

Cheers :slight_smile:


Thanks. I think I have a pretty good understanding of the high-level function and architecture, but I am very fuzzy on the computational level. I have no background or intense focus in neuroscience; my approach is not neuromorphic at all. I am trying to perform the same function, generalization, but via a very different low-level mechanism. I am exploring the neural stuff too, as a plan B, and will try to flesh out the details later.