Today I realized something I hadn’t noticed before. Tell me if this makes sense or if I’m confused.
We talk about a hierarchy in the neocortex where columns send data to a region of neocortex that is above them in the hierarchy. In other words, the regions of neocortex are networked in a highly hierarchical fashion, right?
That’s one type of hierarchy - a region hierarchy (which is technically a network, but one arranged as a hierarchy).
I realized today that there is another type of hierarchy - size hierarchy. A hierarchy of scale.
Neurons make up columns, columns make up regions, regions make up brains.
At each level, you have a network where the nodes have many very close connections and few faraway connections. A cluster of locally connected nodes can be seen as a single node in a higher size-scale network. Long-range connections can be seen as connections between nodes in the network above.
What you have is a network of networks - or - the realization that every node in any network is a network itself.
This drawing is centered on the red node to show long-range connections in yellow, mid-range connections in light green, and close connections in dark green.
I think you can abstract that scale hierarchy to look like this, where each level represents the entire network at that particular scale: networks of nodes, which are themselves networks of nodes, and so on.
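To make the node-of-networks idea concrete, here is a minimal Python sketch (everything in it is invented for illustration): treat each cluster of locally connected nodes as one node at the next scale, and only the long-range links survive as that scale's edges. Apply it repeatedly and you get the stack of levels described above.

```python
# A minimal sketch of the scale hierarchy (names and numbers invented):
# collapse each cluster of locally connected nodes into one node at the
# next scale; only the long-range links survive as edges up there.

def coarse_grain(edges, node_to_cluster):
    """Return the higher-scale edge set given fine-scale edges and a
    mapping from each fine-scale node to its cluster id. Within-cluster
    edges vanish into the new node; between-cluster edges remain."""
    coarse_edges = set()
    for u, v in edges:
        cu, cv = node_to_cluster[u], node_to_cluster[v]
        if cu != cv:  # only long-range links become higher-scale edges
            coarse_edges.add((min(cu, cv), max(cu, cv)))
    return coarse_edges

# Toy example: neurons 0-5 form two "columns" (clusters A and B).
neuron_edges = [(0, 1), (1, 2), (0, 2),   # dense local wiring in column A
                (3, 4), (4, 5), (3, 5),   # dense local wiring in column B
                (2, 3)]                   # one long-range link
membership = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

print(coarse_grain(neuron_edges, membership))  # {('A', 'B')}
```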
Anyway. Is this scale hierarchy the same as the region hierarchy? It seems not. It seems like the top-level network (regions) is arranged as a hierarchy itself. Is that right?
Whereas on the other end of the spectrum, the network of neurons inside a cortical column is more of a flat network, not arranged in a hierarchy. Is that correct?
Anyway, enough of my scribbles. is this the right way to think about it?
I have been reading about hierarchy in relation to the brain for about as long as I have been studying biology. The bulk of what I have read anthropomorphizes this in the shape of how humans break down tasks with certain common organizing motifs. This seemed natural and I accepted it without question. You can see that I am not alone in this - older Numenta papers featured hierarchy so prominently that it is part of the name of the HTM model.
See page 9.
As I learned more about how the brain is organized - I read thousands of papers and books (yes, thousands!) - I started to have my doubts. Many facts and observations just did not fit this model.
I have tried to keep an open mind about how this or that model “almost” works - maybe I have one or two things wrong about it, and with a little fixing up it would be just fine. But some models start to accumulate so much cruft that you just have to put them on the shelf and quit paying any attention to them. The concept that hierarchy concentrates or consolidates to some higher-order representation fits in that crufty niche for me now.
I have come to this point through several different lines. One particularly notable line is developmental biology. What kind of operations does the growing brain use to turn from a few cells into hundreds of millions of cells all hooked up the right way? This is all done without a schematic or wiring diagram. Check out a paper that Numenta has touted many times - in it, V. B. Mountcastle gives a nice overview of how the cells in the cortex hook themselves up as the cortex grows. Columnar organization of the neocortex - V B Mountcastle https://academic.oup.com/brain/article/120/4/701/372118
See the section “The ontogenesis of the neocortex”
Cells do basic operations like migrating in waves and spreading out laterally. They can split into different types. They generate chemical markers to use as guiding scaffolds in place of schematics and road maps. These chemical markers can be gradients across a map; two gradients running “top to bottom” and “left to right” (the direction labels being arbitrary here) effectively give each position in a map a 2D address. They can make stripes and spots - see zebras and leopards for examples of this. One of the more astounding feats is the generation of inter-map axonal projections. This is roughly the same as your finger growing miles long and, in the process, ending up at the right house correctly positioned for pushing the doorbell - without looking. The chemical markers guide the process: the growth cone on the axon samples the markers in the soup and sniffs out the position in the target map that has the exact same x/y smell as the cell the axon started from. As it grows, it adds to the correct side to steer in the direction that smells right. What it does NOT have is the right details to form a complicated wiring diagram - it is part of a roughly parallel bundle of axons growing as a fleet to a target position in a distant map. (Cue “Ride of the Valkyries”!)
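As a toy illustration of that growth-cone steering - purely a sketch with invented gradients and an invented step rule; real axon guidance involves many ligand/receptor systems:

```python
# Toy sketch of gradient-guided axon growth (invented names and rules):
# the target map carries two orthogonal chemical gradients, so every
# position has an x/y "smell"; the growth cone steps toward the spot
# whose smell matches the address of its cell of origin.

def smell(pos):
    """Marker concentrations at a map position: one gradient per axis."""
    x, y = pos
    return (x, y)  # identity gradients keep the toy simple

def grow_axon(source_address, start, max_steps=200):
    """Step the growth cone until the local smell matches home."""
    pos = list(start)
    for _ in range(max_steps):
        sx, sy = smell(pos)
        tx, ty = source_address
        if (sx, sy) == (tx, ty):
            break  # arrived: same x/y smell as the cell it grew from
        # grow on the side that smells more like the target, per axis
        pos[0] += (tx > sx) - (tx < sx)
        pos[1] += (ty > sy) - (ty < sy)
    return tuple(pos)

# An axon from cell (7, 3) in the source map lands at position (7, 3)
# in the target map - no wiring diagram needed.
print(grow_axon(source_address=(7, 3), start=(0, 0)))  # (7, 3)
```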
The last hurrah of the hierarchy for me was trying to use the “crazy quilting” and slight spreading of axons to pull off some sort of hierarchical concentration as the information ascends the maps.
In the discussion that followed, and in reading many papers on the loose ends that came up in those exchanges, I found that while there are discontinuities, they are just that - discontinuities. The maps are scattered, but they do maintain topology in the multiple scattered copies of the input topology located all over the brain.
There are other lines of factoids that cast doubt on the concentrating of information as you ascend the hierarchy, but I consider the developmental facts to be the most damning.
So what do the maps do with these bundles of parallel fibers? I put down some of the key factors in this post:
Which results in this schematic diagram showing the relative relationships of SDR size to spatial mapping size:
It makes little or no sense that the flow is a straight line from V1 to the association regions. There would be very little for the dendrites in the next map to use to form new SDRs; the fibers are too far apart for even two of them to reach a single dendrite - let alone enough to form an SDR. What must be happening is that the projecting axons from different maps come together to form new SDRs that combine the outputs of these other maps.
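Here is a back-of-the-envelope sketch of that density argument in Python (all numbers invented): with one upstream map, the active axons land too sparsely for a dendrite to collect enough inputs; several converging maps push the local count past an SDR-forming threshold.

```python
# Back-of-the-envelope model of the convergence argument (all numbers
# invented): axons from one map arrive too sparsely for a local dendrite
# to collect enough inputs, but several maps projecting onto the same
# sheet raise the local density past an SDR-forming threshold.

import random

random.seed(1)
SHEET = 1000.0         # width of the target map (arbitrary units)
AXONS_PER_MAP = 50     # active axons one upstream map contributes
DENDRITE_REACH = 20.0  # how far a dendrite can reach for synapses

def project(n_maps):
    """Positions where active axons from n_maps upstream maps land."""
    return [random.uniform(0, SHEET) for _ in range(n_maps * AXONS_PER_MAP)]

def inputs_in_reach(axons, dendrite_at=500.0):
    return sum(abs(a - dendrite_at) <= DENDRITE_REACH for a in axons)

print("one map:  ", inputs_in_reach(project(1)))  # ~2: not enough for an SDR
print("five maps:", inputs_in_reach(project(5)))  # ~10: enough to combine
```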
Also - we have neurologists who have been giving us detailed connection maps for decades, and it seems like every map is connected to many other maps. We trot this diagram out to frighten neural newbies:
One of the basic truths of the brain is that if information will be used in a certain part of the brain, there has to be a direct or indirect connection from where that information is sensed. The good news here is that almost everything is connected to everything through a low number of hops - sort of a "six degrees of Kevin Bacon" thing.
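The low-hop-count claim is easy to picture with a breadth-first search over a toy connection graph (the adjacency list below is invented, not real tract data):

```python
# BFS over a toy connection graph: every area ends up a handful of hops
# from V1. The adjacency list is invented, not real connectivity data.

from collections import deque

connections = {
    "V1": ["V2"], "V2": ["V4", "MT"], "V4": ["IT"], "MT": ["parietal"],
    "IT": ["association"], "parietal": ["association"],
    "association": ["HC/EC"], "HC/EC": [],
}

def hops_from(start):
    dist, queue = {start: 0}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in connections[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

print(hops_from("V1"))  # e.g. {'V1': 0, 'V2': 1, ..., 'HC/EC': 5}
```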
I do see the information being compared and contrasted as it splits apart and rejoins going up the hierarchy. I see the maps teasing out as many relationships as possible and presenting that to the association region to form codes representing objects.
This is the essence of my hex-grid coding scheme and where I think that Numenta is going with the 1000 brains theory.
I have a feeling that this is personal. In some brains, experience does converge into recognizable conceptual hierarchy. In others, you have local hierarchies, but they stay largely separate. No links, just a personal observation :).
There is learning on all levels of structure; we are not insects. Higher-in-scale and lower-in-the-pipeline structures are more genetically influenced, but nothing is set in stone. Ok, that might be an exaggeration, to compensate for your exaggeration :). But even prenatal development is adaptive, which is a form of learning.
Anyway, effective structure is a composite of all levels, so learning in finer structures alters the function and interaction of coarser ones. For example, we all have two hemispheres, but the degree of lateralization generally increases during development, to an extent that differs between individuals. And lateralization, to whatever extent, is the top component of the generalization hierarchy.
So, large areas and their fiber tracts may be mostly genetically determined, but the hierarchical relationship between them is informed by learning. I assume that specific short-term pressures, or innate sensitivity to such pressures, would increase detail-oriented learning at the cost of broad generalizations.
I’m not sure if you are disagreeing with me or you just don’t like the way I say things.
I think we are in agreement on the gross structure of maps and fiber bundles. I think we agree that the function is strongly dictated by the learned connections at the terminus of these bundles. We agree that this learning is distributed in the maps from the primary sensory areas through to the lobe hubs - or, as they are often called, the association regions.
In a prior post on the definition of intelligence I stated that the training absorbed, combined with the gifts of genetics, acts to select the best actions in various settings. More intelligent people tend to pick what are thought to be better answers. I think we are in general agreement even though we are saying this in very different ways.
The original post asked about the hierarchical organization, and my position is that it is not what is normally thought of as a hierarchy but instead a parallel data flow where the stream splits and rejoins to extract features of the stream. The representation maintains topology through this process. What is presented to the association region is a feature-rich, topologically aligned stream that can be sampled for space-time features that add up to objects and spatial relationships. This is still in a distributed form that occupies some spatial extent of the association region and may be morphing over time.
In the case of the eye (the processing of which consumes roughly 30% of the real estate of the cortex), the presented features strongly exploit the temporal nature of the processing, collecting the topologically organized stream of saccades to match up features from frame to frame for object recognition. This is still not really what could be thought of as a spatial hierarchy, although a case could be made for some kind of temporal hierarchy during recognition.
Basically, you stress the negative: NOT all / most relationships are hierarchical, and then it’s NOT what “conventionally” is thought of as hierarchy.
I stress the positive: upstream areas contain representations that are at least as general - AKA stable, AKA invariant - or more so. (Edit: they are projected to be as or more general according to prior experience, but may not turn out to be such in the future.)
Which can only be discovered by expanding the effective receptive field of these representations. It may expand across space or over time (the difference is often a matter of POV: one maps to another, as with your eye saccades), and this expansion is selective vs. contiguous in various ways.
But it’s not a question of hierarchy or not, just a shape and degree of deep vs. flat in that hierarchy.
Generalization in the cortex is like evolution in biology: it may not be consistent in form or direction, but nothing makes any sense without it.
Do you feel that this definition is informative?
My intent here was to stress individual variation in width vs. depth of hierarchical organization. I feel that a lot of theoretical disagreements come down to a mismatch between these parameters in the participants :).
This is what happens when you summarize an idea with a sentence that hits the key point - it is often weak and misses the supporting ideas. You know that I am not afraid to dump a wall of text to support an idea - just not every time I mention it in passing.
We have a long-winded thread on intelligence vs. consciousness and it is not topical to the original post - if you want to continue this idea I suggest that we switch to that thread, but a short response is this:
I do.
In that thread there were several attempts to examine exactly what intelligence is. Most were a laundry list of traits that assumes that if you have these traits, you are intelligent. As you correctly stated above, intelligence is a continuum, so you end up with a multivariate input function yielding a slope with no clear relationship between the factors in the definition and the result. It really misses defining the core of what intelligence is.
The definition I offer is simple and gives a general metric based on the concept that the purpose of having a nervous system is to select and perform some action based on perception. If you think about most measures of intelligence you can boil them down to measuring this exact thing. There may be internal factors that drive this selection of action but in the end - smart people understand what they are seeing and make better choices.
I am scared shitless of any philosophical discussion about “consciousness”.
It’s the purest form of intellectual porn.
I disagree, respectfully :).
I am sure you’ve seen my definition, but it didn’t strike a chord:
“Intelligence is a general cognitive ability, ultimately an ability to predict. That includes cognitive component of action: planning is technically self-prediction. And prediction is interactive projection of previously discovered patterns. This perspective is well established, pattern recognition is a core of any IQ test. But there is no general and constructive definition of pattern and recognition. Below, I define (quantify) similarity for the simplest inputs, then describe hierarchically recursive algorithm to search for patterns in incrementally complex inputs…”: http://www.cognitivealgorithm.info
This actually does give a simple metric, which can be applied automatically at the lowest levels of processing. As I understand it, your metric is a human judge observing some complex behaviour and applying a black-box load of implicit subjective criteria, case by case.
But it’s very hard to argue on this level of generalization…
This fight is an uphill battle. Literally the majority of the machine learning and neuroscience communities still think of the hierarchy in the classical way.
As you go up the hierarchy, each region still gets sensory input, but it represents less detail across a larger “field of view”. This means higher levels can identify very large objects before the lower levels and inform them so they can resolve their representations, allowing them to “learn the right object” as they inspect it.
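A minimal sketch of that receptive-field growth, with invented sizes: each level pools blocks of the level below, so each of its cells covers a larger patch of the sensor at lower detail.

```python
# Each level pools blocks of the level below: fewer cells, each covering
# a larger field of view at lower detail. Sizes are invented.

def pool(image, block=2):
    """Average block x block patches. Each output cell now covers a
    larger patch of the original sensor, at lower detail."""
    h, w = len(image), len(image[0])
    return [[sum(image[y + dy][x + dx] for dy in range(block)
                                       for dx in range(block)) / block ** 2
             for x in range(0, w, block)]
            for y in range(0, h, block)]

retina = [[(x + y) % 5 for x in range(8)] for y in range(8)]  # 8x8 "sensor"
level1 = pool(retina)   # 4x4: each cell sees a 2x2 retinal patch
level2 = pool(level1)   # 2x2: each cell sees a 4x4 retinal patch
print(len(level2), "x", len(level2[0]))  # 2 x 2
```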
You can build a hierarchy from any point in cell space. Just look at one neuron’s connections (a map of maps). BOOM! hierarchy.
Looking at it this way, there are many hierarchies of “scale”, as you say, depending on which layer you probe and which type of cell you hit. This hierarchy is just cell connectivity, and if you look around you in the world, you can observe hierarchy simply by picking some object and attempting to classify its parts, and those parts’ parts, etc. I think this is the type of hierarchy you are talking about.
I don’t think this is the same as the hierarchy neuroscientists talk about (although it absolutely is the type of hierarchy Deep Learning implements).
From my perspective, the main difference between classical hierarchy (used by DL implementations) and cortical hierarchy is the fact that abstraction is formed within the cortical column itself (not in the connections between cortical columns). This enables things like level skipping, reciprocal connections, or even (theoretically) depth-independent recursive hierarchies that you could never do in the classical tree-like structure. You still are building up abstractions the further you travel the pipeline, though, so it isn’t completely different from a traditional hierarchy from that perspective.
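To make that contrast concrete, here is an illustrative sketch (the wiring is invented): the classical scheme is a strict tree, while the cortical-style graph can legally contain level skips and reciprocal edges, since the abstraction lives inside each region rather than in the between-region wiring.

```python
# Illustrative wiring only - not real connectivity. The classical scheme
# is a strict tree; the cortical-style graph allows skips and feedback.

classical = {            # strict tree: each region feeds only its parent
    "V1": ["V2"],
    "V2": ["V4"],
    "V4": ["IT"],
}

cortical = {             # general graph: skips and reciprocal links
    "V1": ["V2", "V4"],  # level skipping: V1 projects straight to V4
    "V2": ["V1", "V4"],  # reciprocal feedback down to V1
    "V4": ["V2", "IT"],
    "IT": ["V1"],        # long-range feedback to the bottom
}

def is_strict_hierarchy(graph):
    """True when every region has at most one incoming projection,
    which is what a tree-shaped hierarchy requires."""
    targets = [t for outs in graph.values() for t in outs]
    return len(targets) == len(set(targets))

print(is_strict_hierarchy(classical))  # True
print(is_strict_hierarchy(cortical))   # False: many paths to one region
```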
I hate to beat a dead horse but I will take another whack at this one.
The “negative attitude” about hierarchy is a direct attack on this classic Christmas tree of concentration as one ascends the hierarchy:
This hides the Christmas tree in each of the streams but implies that there is some sort of orderly progression from input to output, and that there is some high-level version that can be combined.
What I see after years of tracing out data paths is that everything is connected to everything and the connections are broad copies of each map to the next:
(Not the actual map, look at post #2 above for an example)
The WHAT and WHERE streams do eventually converge at the HC/EC complex, but everything in between is essentially a haystack of connections. Note that the processing at almost every map in the cortex is the convergence of two or more maps. I see that everything between the sensory cortex and the association regions is some sort of compare/contrast operation. After all these years I still have not been able to put my finger on an exact description of what the maps are doing with these inputs other than saying it is extracting features.
I guess that’s addressed to me. I never proposed a “classical” hierarchy; it ignores temporal accumulation and the discontinuities that result from pruning the search tree. I was talking about incremental generalization and composition, which has to be driven by search expansion in both space and time.
BTW, I think what and where converge in other places too. Specifically in inferior parietal, and indirectly in dlPFC
The classical description of hierarchy seems more useful if you are interested in things like anatomy and medicine, since it’s an approximation of connectivity. I think a lot of us, at least myself, get pretty annoyed by how neuroscience works because neuroscientists are more concerned with healthcare than AI.
I think the less detailed sensory input to non-primary regions is often like a different submodality. If your sensors describe a bunch of different things about the object, some of those things won’t be as spatially fine-grained as others. That can make it hard to tell whether a region is primary or higher order - does it receive direct sensory input because of level skipping, or because it processes a different aspect of the sense?
Some sensory submodalities arguably skip the first level of the hierarchy*. I think the targeted cortical region is still higher order if it doesn’t receive input from pathways directed up the hierarchy. At least, it plays the same role as a higher order region in the thousand brains theory. Maybe it’s better to talk about scale than hierarchy.
*E.g. a couple whisker pathways (VPM head and tail) only target whisker S1 in the septal columns and dysgranular zone (maybe specifically L4, I don’t remember), which are higher order if you define hierarchy by the CTC pathway and take the results from a few papers for granted. The thalamocortical axons spill over into the barrels but those might target distal dendrites of cells outside the barrels. As another example, some types of koniocellular cells in LGN might not target L3/4 of V1. I might be remembering wrong.
Let’s talk about both! Looking at a very large item that takes up your whole field of view would depend heavily on higher levels of the hierarchy, which get a larger FOV of the sensory space. V1 is looking at a very small FOV (like looking through a straw).
Imagine an elephant standing a few paces in front of you. V1 might not have any columns that can identify the object, even with very high detail of small areas. But higher levels will get bigger-scale inputs and be able to match objects they have seen at that scale. In this case, voting from the high levels of the hierarchy can inform the lower levels about what the object is. But you can still recognize elephants in V1 - they just have to be on the horizon, basically very small. Then V1 can inform your higher visual levels what that hazy grey spot is.
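A toy sketch of that voting step (the candidate sets are invented, loosely in the spirit of the Thousand Brains voting idea):

```python
# Toy voting step (candidate sets invented): the higher level's large-FOV
# guess is intersected with each V1 column's local candidates.

v1_columns = [
    {"elephant", "rhino", "wall"},    # a grey patch, locally ambiguous
    {"elephant", "tree", "rope"},     # something trunk-like
    {"elephant", "pillar", "tree"},   # something leg-like
]
higher_level_vote = {"elephant", "whale"}  # big FOV: "large grey thing"

resolved = [column & higher_level_vote for column in v1_columns]
print(resolved)  # every column narrows to {'elephant'}
```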