How does short term memory work?


Numenta posted this on LinkedIn:
“In an ongoing debate over how working memory functions within the brain, two dueling papers in this edition of the Journal of Neuroscience differ over how they model the process. The central argument is what happens after you hear or see what you need to remember and must then hold it in mind for later. During that interim period, do neurons in the prefrontal cortex maintain it by persistently firing, or do they spike bursts to transfer information via the patterns of their connections? One of the researchers noted that “We need more experiments to settle this. They will give us not only more insight into this question of persistence but also about working memory function.”

This seems like one of the most basic questions about how the brain works.


The HTM output layer gives a third option, although it also uses persistent activity: each new input reduces the set of neurons that are on. Maybe working memory is based on which neurons can or cannot fire.
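To make the "each input reduces which neurons are on" idea concrete, here is a minimal sketch in plain Python. It is not Numenta's implementation; the object names, the feature names, and the `narrow`/`recognize` helpers are all made up for illustration. Every candidate starts out active, and each sensed feature switches off the candidates that are inconsistent with it.

```python
# Hypothetical object library: each object is the set of features it contains.
OBJECTS = {
    "mug": {"handle", "rim", "curved_side"},
    "bowl": {"rim", "curved_side"},
    "pen": {"clip", "tip"},
}


def narrow(candidates, feature, objects=OBJECTS):
    """Keep only the candidates that are consistent with the sensed feature."""
    return {name for name in candidates if feature in objects[name]}


def recognize(features, objects=OBJECTS):
    """Start with every candidate active, then let each input switch some off."""
    active = set(objects)
    history = [set(active)]  # snapshot of the active set after each step
    for feature in features:
        active = narrow(active, feature, objects)
        history.append(set(active))
    return active, history


active, history = recognize(["rim", "handle"])
print(history)  # the active set shrinks as inputs arrive
print(active)   # what is still "held in mind" after both inputs
```

In this toy version the memory is nothing more than the surviving active set, which is the sense in which it depends on which units can or cannot fire.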


“Is there any neuroscience evidence that STM is transferred to LTM?”
I originally wanted to post this question.

If there is no evidence that memory moves from one place to another, STM and LTM may work on the same principle and both be distributed across various places in the cortex.

I suggest the forum add a root neuroscience category.


It may not “move” but there is considerable research to show that the contents of the hippocampus are played back to the attached cortex during REM sleep. This process is absolutely necessary for episodic/long term memory to form in the cortex.
Look up “patient HM” on Google.


If I were designing STM, I’d leave a chemical trace from a temporal node representing the approximate time since the last STM to LTM burn in, one that lights up clusters of things that occurred around that time. There would also be chemical traces marking whether specific paths were related to positive or negative outcomes.

Then during burn in, I shift weights for other network types and clean out the STM traces.

The consequence of a system like that is that if you take too long between burn ins, you end up with STM soup all over everything. It also means that if I can force clean up of STM traces, I can blank out that day’s STM. In either case, you fail to learn anything new.

The other interesting consequence of a system like this is that it might make it challenging to remember STM constructs in the order they occurred without using prior learning to force the order.
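A rough sketch of that design, in plain Python, might look like the following. Everything here is an assumption made for illustration: the `TraceMemory` class, the `learning_rate` value, and the idea of representing a "chemical trace" as a `(tick, path, valence)` tuple. It is a toy, not a model of real neurochemistry.

```python
from dataclasses import dataclass, field


@dataclass
class TraceMemory:
    weights: dict = field(default_factory=dict)  # long-term store: path -> weight
    traces: list = field(default_factory=list)   # short-term store: (tick, path, valence)
    tick: int = 0                                # time steps since the last burn in
    learning_rate: float = 0.1                   # assumed value, chosen arbitrarily

    def experience(self, path, valence):
        """Leave a trace tagged with the time since the last burn in and an outcome sign."""
        self.tick += 1
        self.traces.append((self.tick, path, valence))

    def recall_around(self, tick, window=1):
        """Light up everything that was traced near a given time tag."""
        return [p for t, p, _ in self.traces if abs(t - tick) <= window]

    def burn_in(self):
        """Shift long-term weights by each trace's valence, then clean out the traces."""
        for _, path, valence in self.traces:
            self.weights[path] = self.weights.get(path, 0.0) + self.learning_rate * valence
        self.wipe()

    def wipe(self):
        """Forced clean-up: the traces vanish without touching the weights."""
        self.traces.clear()
        self.tick = 0


memory = TraceMemory()
memory.experience("path_a", +1.0)  # good outcome
memory.experience("path_b", -1.0)  # bad outcome
around_first = memory.recall_around(tick=1, window=0)
memory.burn_in()
print(around_first, memory.weights)
```

Calling `wipe()` without `burn_in()` reproduces the "blank out that day" case: the traces are gone and the weights never changed. Recall here is keyed only on the approximate time tag, which is one way to see why exact ordering would be hard to recover from traces alone.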


Is there any more research on STM and LTM beyond the hippocampus and HM?


Vast amounts.
My research notes on the hippocampus include (in no particular order):
A Dual Role for Hippocampal Replay
A Neural Network Model of the Cortico-Hippocampal Interplay - Contexts and Generalization
A neural-level model of spatial memory and imagery
Bunches of things on pathways and connections
Bunches of things on waves and spike timing related papers
Computational Models of Grid Cells
Context prediction analysis and episodic memory
Coordinated Excitation and Inhibition of Prefrontal Ensembles during Awake Hippocampal Sharp-Wave Ripple Events
Dendritic mechanisms of hippocampal place field formation
Dynamic Reconfiguration of Hippocampal Interneuron Circuits during Spatial Learning
Experience-dependent firing rate remapping generates directional selectivity in hippocampal place cells
High Dimensional Nonlinear Learning using Local Coordinate Coding
Hippocampal Attractor Dynamics Predict Memory-Based Decision Making
Hippocampal Auto-Associative Memory
Hippocampal coupling with cortical and subcortical structures in the context of memory consolidation
Hippocampal Neurogenesis Regulates Forgetting During Adulthood and Infancy
Hippocampal Offline Reactivation Consolidates Recently Formed Cell Assembly Patterns during Sharp Wave-Ripples
Hippocampal Representation of Related and Opposing Memories Develop within Distinct, Hierarchically Organized Neural Schemas
Hippocampal signatures of episodic memory - evidence from single-unit recording studies

This is a tiny fraction of my notes on papers related to this topic.

The one-day sized buffer that is the hippocampus is critical to understanding long term memory formation so I would start your studies there.

This is a huge topic in the study of the brain and memories. I can point you to specific things I have seen but I really don’t have a royal road to understanding the complex systems that make up the various hierarchies of memory systems at work in the brain. It has taken me years to get a hazy understanding of the big picture on how this all works; I don’t expect it to be much faster for anyone else.


At first glance these two mechanisms don’t seem contradictory or mutually exclusive. Persistent activity has the advantage that it’s easier for other neurons to recognize a static pattern than a constantly changing one. Maintaining information via synaptic connections has the advantage that the information can be updated as needed. However, I have not yet read the articles so I could be wrong.


Some light reading on the topic.


You did ask about LT/ST outside of the hippocampus and I did not answer your question directly but pointed you to the central nature of the entorhinal cortex / hippocampal formation (EC/HF) in the memory systems.

I feel like I left your question less than half answered.

There are several interrelated time scales in the memory system.

The shortest is actual perception.

Say what?

Yes - perception is an active recall process. In the first part of perception, cells in the cortex and thalamus have some raw unrecognized perception filtering in. If this matches some previously memorized pattern fragments, mini-columns fire in matching single cell bursts. This excites the thalamus in the tonic mode and there is a spreading of activation to surrounding parts of the map. At the same time the pattern of recurrent thalamus-cortical resonance is started.

The thalamus is interesting in that it also has connections between maps - activity in one map is passed on to prime certain surrounding maps for action. Think of this as a control signal. The reason that this is interesting is because there are matching inter-map connections between these maps from L2/3 layers; bi-directional connections that carry data. As the sensed perception is recognized this coordinated action serves to spread recognition and activation out like ripples in a pond.

BTW: The linked article at the top of this post discusses the low-level mechanism of this activation, with a focus on certain parts of the forebrain as plans are formed and elaborated. We here in the HTM community have a somewhat different view of how to interpret these findings.

This recognition is never exact. In the parts of the maps where there is excitation from a strong sensation or the surrounding thalamus - but no match - the mini-columns burst in all of their cells. This blast of signal fires the Reticular Activating Complex (RAC) and causes it to gate in more of the novel sensed pattern. (This is distinctly different from tonic mode.) This is combined with the training signaling at the mini-column level, causing the unrecognized pattern to be learned rapidly. I have referred to this elsewhere as the delta to your recalled memories; I like to think of this as icing being added to the recalled cake.
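The contrast between a recognized input (a single matching cell per mini-column) and an unrecognized one (every cell in the mini-column bursting) is the same rule HTM temporal memory uses. A minimal sketch, with made-up helper names (`activate_column`, `step`) and an assumed four cells per mini-column, leaving out the thalamus, the RAC, and learning entirely:

```python
def activate_column(predicted_cells, cells_per_column=4):
    """Return (active_cells, burst) for one mini-column that receives input.

    If some cells were predicted, only those fire (the input was recognized).
    If none were predicted, every cell fires (the column bursts).
    """
    if predicted_cells:
        return sorted(predicted_cells), False
    return list(range(cells_per_column)), True


def step(active_columns, predictions, cells_per_column=4):
    """Activate each driven column and report which ones burst (novel input)."""
    active = {}
    novel = []
    for col in active_columns:
        cells, burst = activate_column(predictions.get(col, set()), cells_per_column)
        active[col] = cells
        if burst:
            novel.append(col)
    return active, novel


# Column 0 had cell 2 predicted; column 1 had no prediction at all.
active, novel = step(active_columns=[0, 1], predictions={0: {2}})
print(active)  # column 0 fires one cell, column 1 fires all four
print(novel)   # the bursting columns are where the "delta" would be learned
```

The `novel` list is the part that would be gated in and learned quickly; the columns that did not burst are the recalled cake.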

As you add more details, you are likely to change these memories in one of two ways: either the pattern gains more detail about what a given pattern might be (the same face from different angles or in different lighting) and becomes more general, or the pattern “splits” into two distinct patterns (different faces).

In either case - these patterns run up the hierarchy to the association areas where they form higher level object recognition and the fusion of the various senses.

At the same time other paths in the brain extract the WHAT and the WHERE of the objects and spaces being sensed. This all eventually collects in the temporal lobe as your perception of “now.”

The time scale for all this is in the range of a few alpha cycles - say 100-500 ms.

This brain-wide activation is fairly persistent and may chain from one moment to the next moment. Elsewhere I have offered how this is experienced as consciousness.

This whole ensemble of recognition is also your short term memory. You are remembering “here and now” as long as this activity pattern is maintained. This can be from 100 ms to 3 minutes or so.

As this happens this episodic memory is experienced in the temporal lobe and through that, the hippocampus. If I understand the EC/HF correctly it memorizes faster than the cortex but only has a limited capacity - perhaps only a day or so.

As the episodic memories are laid down in the hippocampal formation the amygdala may add good/bad flavors to these memories based on the emotional perception of the outcomes of the experience.

This is a “one-day sized” buffer of collected short term sequences, with emotions embedded in the memory.

So how does this get out of the hippocampus and back to the cortex?


When you sleep your brain is “disconnected” from the body and special synchronization waves called spindles drive the hippocampus to replay the newly learned experience back at the cortex.
This spreads via the L2/3 connections and the thalamus connections. I believe that the cortex is a slow learner so this experience is repeated rapidly until the cortex learns it well enough to play it back the same way it does when responding to outside experienced sensations. When the cortex and the hippocampus respond to the synchronization wave equally the training is complete.

Now this learned pattern has been transferred to long term memory and is available as patterns for perception of new experiences.
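As a toy illustration of the fast-learner/slow-learner idea, here is a sketch in plain Python. The `consolidate` function, the learning rate, the tolerance, and the use of a simple list of numbers as a "pattern" are all assumptions made for the example; real replay is obviously far richer. The hippocampus holds the pattern after one exposure, and the cortex nudges itself a little toward it on each replay until the two respond almost identically.

```python
def consolidate(hippocampus, cortex, learning_rate=0.05, tolerance=0.01, max_replays=1000):
    """Replay the hippocampal pattern until the cortex reproduces it.

    Returns the trained cortical pattern and the number of replays used.
    """
    replays = 0
    while replays < max_replays:
        # Largest mismatch between the stored pattern and the cortical response.
        error = max(abs(h - c) for h, c in zip(hippocampus, cortex))
        if error <= tolerance:
            break  # both respond (nearly) equally: training is complete
        # Slow learner: move only a small step toward the replayed pattern.
        cortex = [c + learning_rate * (h - c) for h, c in zip(hippocampus, cortex)]
        replays += 1
    return cortex, replays


episode = [1.0, 0.0, 1.0]          # stored in one shot by the fast learner
naive_cortex = [0.0, 0.0, 0.0]     # cortex before any replay
trained, replays = consolidate(episode, naive_cortex)
print(replays, trained)
```

Raising `learning_rate` cuts the number of replays needed, which is one way to think about why a slow-learning cortex would need the experience repeated many times over the night.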

Slightly more detail on the sleep/memory consolidation thing:


I really like your explanation here.

I have been thinking about building a deep learning model that tries to simulate the hippocampus growing new connections to learn quickly. I think it would be fun to play around with different loss functions and connection rules to see if I can find something that works well.

Is there a good resource on any of the dynamics of how hippocampal neurons choose to connect?

Thanks for the discussion.