I don’t see the connection myself. From what I can tell, your daisy chain concept is about holding onto a (presumably unmodified) signal for an extended period of time by passing it around in a loop of connections like the game “hot potato”.
But that isn’t how I interpret the image in Numenta’s article:
Now granted I am no neuroscientist, but this doesn’t look like it is designed for holding onto a signal to me. The thalamus section looks more like a two-level hierarchy which is driven by activity in cortical layer 6 combined with neighboring activity in the thalamus.
Assuming this circuit is related to attention (I recall from previous conversations that cortical layer 6 and the thalamus are involved in attention), my naive interpretation of this circuit is that layer 6 is informing the thalamus what its region is currently attending to. That input then gets combined and abstracted with other activity from outside the region (from this diagram I can’t really tell whether that is activity from other cortical regions or from subcortical structures) in order to ultimately give more weight to some subset of the input, thus influencing what is being attended to.
I should point out that I haven’t had a chance to read and absorb the materials on the thalamus that @Bitking has posted on previous threads, so the above is a very uninformed interpretation. I’m mainly making the point that there isn’t any obvious correlation between this and the daisy chain concept.
@keghn_feem - There are a couple of ways of making a memory device with an ANN. One way is to daisy-chain neurons in a loop that feeds back into itself, also known as a delay line. It is like an RNN, but it is a pure transmission line and the data is not changed.
I like the concept of using a daisy chain of nerves in a loop to hold a pulse, just like the mercury delay-line memories of old.
I know of mercury delay lines as memory; some of the older mainframes that I am familiar with used these. There is no analog to this in neural hardware. There is considerable delay in long nerves such as from the feet to the head but this is not used as any sort of memory.
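For anyone who has not met delay-line memory, here is a minimal sketch of the idea being discussed: a ring of relay stages that passes a pulse along unchanged and feeds the output back to the input. The class name `DelayLine` and its `tick` method are made up for illustration; this models the computer-memory device, not any neural circuit.

```python
from collections import deque

class DelayLine:
    """A ring of `length` relay stages; each tick moves every value one stage along."""

    def __init__(self, length):
        self.stages = deque([0] * length, maxlen=length)

    def tick(self, new_input=0):
        # Read the value that is about to leave the last stage.
        out = self.stages[-1]
        # Shift everything one stage along; the last value wraps to the front,
        # which is the "feed back in" part of the loop.
        self.stages.rotate(1)
        # An external write overrides the recirculated value.
        if new_input:
            self.stages[0] = new_input
        return out

line = DelayLine(5)
# Write one pulse, then let it circulate for ten more ticks.
outputs = [line.tick(1)] + [line.tick() for _ in range(10)]
print(outputs)
```

The pulse reappears at the output once per trip around the loop, and nothing about it changes along the way, which is the "pure transmission line" property described in the quote.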
When it comes to neural hardware one of the more common mechanisms proposed to hold short-term working memories is reverberant connections within a single layer or between maps. This paper shows examples of these methods: Synaptic reverberation underlying mnemonic persistent activity http://www.cns.nyu.edu/wanglab/publications/pdf/wang2001a.pdf
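As a rough illustration of how reverberation can hold a trace, here is a toy firing-rate unit with strong recurrent self-excitation. It is a generic bistable rate model, not the model from the Wang paper; the parameter names and values (`w`, `theta`, the stimulus size, the step counts) are assumptions chosen only to make the effect visible.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def simulate(pulse_steps, total_steps=400, w=10.0, theta=5.0, dt_over_tau=0.1):
    """Euler-integrate tau * dr/dt = -r + sigmoid(w * r - theta + stim(t))."""
    r = 0.0
    trace = []
    for t in range(total_steps):
        # Brief external drive starting at step 100 and lasting pulse_steps steps.
        stim = 5.0 if 100 <= t < 100 + pulse_steps else 0.0
        # Recurrent term w * r is the "reverberant" feedback onto the same population.
        r += dt_over_tau * (-r + sigmoid(w * r - theta + stim))
        trace.append(r)
    return trace

no_pulse = simulate(pulse_steps=0)
with_pulse = simulate(pulse_steps=50)
print(round(no_pulse[-1], 3), round(with_pulse[-1], 3))
```

Without the pulse the rate settles near zero; with it, the rate stays high long after the stimulus has ended, because the recurrent feedback alone is enough to sustain it. Unlike the delay line, the stored "memory" here is a persistent activity level rather than a circulating copy of the signal.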
There are also some synapses that have been measured to have activation time in minutes once triggered but I would have to dig in my archives to get a reference to those.
Still working on it; not ready for public viewing yet.
But RSN (Real Soon Now!)
I think it is going to irritate a lot of people and I want it solid so I don’t waste my time arguing over the poorly implemented details.
I have a bit of a naive observation here. I have not had much time to look deeper into the thalamus and its numerous interactions with the neocortex, but upon first impression it kinda seems to me like the thalamus is performing the role of spatial (and maybe temporal) pooling.
It sits between the sensory inputs (or inputs from other regions) and the cortical columns responsible for the temporal memory encoding.
The thalamic relay neurons operate in tonic and bursting modes which seem capable of both moderating and distributing sub-cortical activation patterns for expected stimulus and registering surprise for unexpected inputs.
Up to this point, I had been viewing the spatial pooler algorithm as a necessary artifact that assists with translating the primitive activation patterns from the encoding process into a proper SDR pattern ready for consumption by the cortical columns of the temporal sequence memory. If my reading of this article is correct, then the thalamus itself (or perhaps some even lower structure) may be performing this function.
I realize this is probably an oversimplification, but does anyone else see this as a potential correlation?
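For context, the core of the spatial pooling step referred to above can be sketched as "score each column by its overlap with the active input bits, then keep the top k". This is a deliberately stripped-down sketch: it leaves out permanence learning, boosting, and local inhibition, and the function names and sizes are made up for illustration rather than taken from any HTM implementation.

```python
import random

def make_columns(n_columns, n_inputs, synapses_per_column, seed=0):
    """Give each column a fixed random set of connected input bits."""
    rng = random.Random(seed)
    return [set(rng.sample(range(n_inputs), synapses_per_column))
            for _ in range(n_columns)]

def spatial_pool(active_inputs, columns, k):
    """Return the indices of the k columns with the largest input overlap."""
    overlaps = [len(col & active_inputs) for col in columns]
    # Sort by descending overlap; ties are broken by column index.
    ranked = sorted(range(len(columns)), key=lambda i: (-overlaps[i], i))
    return set(ranked[:k])

columns = make_columns(n_columns=64, n_inputs=128, synapses_per_column=16)
active_inputs = set(range(0, 128, 4))   # an arbitrary dense-ish input pattern
sdr = spatial_pool(active_inputs, columns, k=4)
print(sorted(sdr))
```

Whatever the density of the input, the output always has exactly k active columns, which is the "turn a primitive activation pattern into a proper SDR" role described in the post.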
I have been touting the “three visual streams” paper for a while now.
The DeepLeabra model does about as good a job as any in showing how the thalamus interacts with the cortex, with the “three streams” paper being a masterwork for showing the model doing useful work.
The Thalamus and layer 5 do the heavy lifting in the feedback direction.
The DeepLeabra predictive model is compatible with the HTM model and I can see them co-existing in the same system. In fact, there is much to like about two entirely different predictive models working together; I think that Mr. O’Reilly could learn a thing or two from the HTM model in the feed-forward direction.
It is a hard read but if you make it through you should have a good idea of how the “double-ended” hierarchy works to guide learning for a rapid convergence to an internal model.
How about the RAS? The ties to the nest of thalamic nuclei? The other sub-cortical structures?
Where do you stop? It is all tied together into a bowl of gooey spaghetti-like connections.
In the end these other systems will have to be addressed but it may be too much to expect that right out of the gate.
Part of why my own work is proceeding so slowly is that I am trying to tie together several models into a working whole. Connecting a subset of the whole requires “filling in the blanks” to make up for the bits that I am not incorporating; some of these band-aids get pretty sketchy.
I do see some value in trying to abstract some minimal stand-alone systems and work out how they function.
Thanks @vpuente. Being a visual person myself, I tend to absorb information from videos a lot easier. The process they described looks like a good candidate for addressing the “switching between objects without a reset” problem that has been discussed on some other threads. You can basically blast the new context when it appears in the input stream to quickly switch from one object to another, then resume normal operation.
The RAC/RAS has been postulated as the “Searchlight of Attention” since this paper by F. Crick:
If you google the term “searchlight of attention” you can see that many papers since this one explore this concept (pro and con) and add considerable insight to the workings of this structure.
It is certainly an important adjunct to the interplay between the Thalamus and Cortex - I am not sure if the attention function should be a stand-alone feature apart from the data path or an inseparable part of the function of these structures.
My personal take runs something like this; I see that the thalamus does at least two different functions in relation to the cortex.
One is closely related to the feedback path in the general direction through various hierarchies as described in the first post above.
There is nothing controversial here. I will add that this path is in parallel with the direct corticocortical pathways. I see the cortico-thalamic-cortical loop as being a command and control pathway where the cortico-cortical pathway is an information pathway.
The second function is also a control function but with a very different purpose - to activate or coordinate the basic information processing method of the cortex. It would be very wasteful to activate large swaths of the cortex where there is no information to be processed. The signal that something is to be recognized and/or learned would be a widespread “surprise” signal from local areas of the cortex. I see that this would trigger the start of traveling waves in that area of the cortex. Due to the map-spanning connections of the thalamus this same C&C function would be relayed to corresponding areas of related maps.
You may have seen this before but I invite you to read it again in light of this discussion:
With these thoughts in mind - the RAS is positioned to compare the (relatively) local activity between the cortex and thalamus. Given the two functions I have outlined above - and assuming that form follows function - where would this fit in for comparing the activity levels of the two structures?
One uses bursting to signal surprise, one uses bursting to jump-start activity. The RAS is looking for something, and bursting seems like a very easy-to-detect signal that it could sense. What if it is simply acting to equalize activity between the two structures? In the process it would act to detect surprise and activate processing on the related information pathway, all using simple and relatively local functions.
This is agnostic as to the WHAT/WHERE functions and is instead a cortex-wide processing method. Please note: hierarchy/pathways are established here by genetic programming and are outside the scope of this discussion.
I am certainly into computers and have been programming for many years but on this forum everything should be based on the biology of the brain - that is the mission here. We try to restrict our hardware to the kind of processing that can be found in the brain.
The brain does not have the usual tools used in classic AI work. There have been many other methods tried (LISP/Symbolic AI, expert systems, heuristic systems, reasoning systems, block worlds, knowledge based systems [frames and scripts], decision trees, propositional and first order logic, inference engines, the list goes on and on) to make an AI but that is not what we do here.
We have “encoders” to simulate the transformations in the early sensory system to formats compatible with processing suspected to be performed by the brain.
I do my non-biological processing on other forums - as should you.
You are correct that I have posted most of my main ideas over the last year: these are the distilled results of studying the biology of the brain since the late 70’s. My main ideas are moving forward slowly as I refine some of the “iffy” areas; there are some sub-cortical areas that are still a complete mystery to me. It is unlikely that I will be adding any bold new concepts to the central big picture any time soon; mostly at this point I am tweaking around the edges.
My main focus now is to reduce these concepts to working programs.
I think that the thalamus is as closely related to the basal ganglia as it is to the cortex. The basal ganglia uses reinforcement learning to predict when the animal will receive rewards or penalties. Animals use the basal ganglia’s predictions to attempt to maximize their cumulative rewards, which drives behaviour. The basal ganglia however does not directly connect to the muscles which control behaviours, instead the cortex connects to the muscles. The thalamus is the major pathway from the basal ganglia to the cortex, and therefore is at the interface between unsupervised and reinforcement learning.
I hypothesize that the function of the thalamus is to control the cortex, with the goal of maximizing the animal’s cumulative rewards.
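To make "uses reinforcement learning to predict rewards" concrete, here is a minimal sketch of temporal-difference learning, TD(0), on a toy chain of states with a single reward at the end. TD learning is a standard textbook algorithm that I am using as a stand-in; the chain task, the function name `td0_chain`, and the parameter values are assumptions for illustration and say nothing about how the basal ganglia actually implements this.

```python
def td0_chain(n_states=4, episodes=500, alpha=0.1, gamma=0.9):
    """Learn state values on a chain 0 -> 1 -> ... -> terminal, reward 1 at the end."""
    # One extra entry for the terminal state, whose value stays at 0.
    values = [0.0] * (n_states + 1)
    for _ in range(episodes):
        for s in range(n_states):
            # Reward arrives only on the final transition into the terminal state.
            reward = 1.0 if s == n_states - 1 else 0.0
            # TD error: how much better or worse things went than predicted.
            td_error = reward + gamma * values[s + 1] - values[s]
            values[s] += alpha * td_error
    return values[:n_states]

values = td0_chain()
print([round(v, 3) for v in values])
```

The learned values rise toward the rewarded end of the chain (approaching 0.729, 0.81, 0.9, 1.0 with these settings), so the prediction of future reward is available well before the reward itself arrives. That early prediction is the kind of signal that could be used to bias what the cortex does next.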
I think that TRN is just a coincidence detector (i.e. a comparator) between L5 and L6 projections. Note that at very low level (such as auditory cortex A1) there is no notion of objects. Just frequency changes.
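By "coincidence detector" I mean something as simple as the sketch below: report the moments when activity on one input has a partner on the other input within a short time window. The spike times and the window are made-up numbers, and the labels `l5_spikes` and `l6_spikes` are only there to tie it back to the sentence above; this is not a model of the TRN.

```python
def coincidences(spikes_a, spikes_b, window):
    """Return the times in spikes_a that have a partner in spikes_b within +/- window."""
    return [ta for ta in spikes_a
            if any(abs(ta - tb) <= window for tb in spikes_b)]

# Hypothetical spike times in milliseconds.
l5_spikes = [10, 50, 90]
l6_spikes = [12, 70, 91]
print(coincidences(l5_spikes, l6_spikes, window=3))
```

Only the events that line up on both inputs get through; everything else is ignored, which is all a comparator needs to do.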
BTW (in my humble opinion) I think pursuing vision (or any other motor-involved sensor) as the main goal introduces a lot of unnecessary difficulty in understanding this. Any “cognitive-level” consideration is way above the bottom of the hierarchy (perhaps 10s of levels). I think that the principles at the bottom are the same as those at the top, but “discovering” those principles from “top” observations seems pretty hard to me.
By “object” above I was referring to the activity in the “output layer”. Numenta’s view is that object representations exist in all hierarchical levels.
Of course the highly abstract concepts are still going to be higher in the hierarchy as you would expect. But HTM proposes that even the very lowest levels are capable of doing a lot more than traditionally thought.
BTW, for reference (I just realized my previous post lacked context), I was referring to the process of “resetting the output layer when switching to a new object” from the Columns Paper. The circuit described in that video (or something similar to it) seems like a good candidate for triggering a “reset”.
Nevertheless, I have some doubts about that view. It seems quite cost-inefficient: would you use multiple synapses across the hierarchy to store information about the same high-level object? It seems more synapse-efficient to collapse as much of the information common to different objects as possible into the lower levels, produce the disambiguation as you move up, and use local redundancy in synapses to increase resilience.