How can we build agents that keep learning from experience, quickly and efficiently, after their initial training? Here we take inspiration from the main mechanism of learning in biological brains: synaptic plasticity, carefully tuned by evolution to produce efficient lifelong learning. We show that plasticity, just like connection weights, can be optimized by gradient descent in large (millions of parameters) recurrent networks with Hebbian plastic connections. First, recurrent plastic networks with more than two million parameters can be trained to memorize and reconstruct sets of novel, high-dimensional 1000+ pixels natural images not seen during training. Crucially, traditional non-plastic recurrent networks fail to solve this task. Furthermore, trained plastic networks can also solve generic meta-learning tasks such as the Omniglot task, with competitive results and little parameter overhead. Finally, in reinforcement learning settings, plastic networks outperform a non-plastic equivalent in a maze exploration task. We conclude that differentiable plasticity may provide a powerful novel approach to the learning-to-learn problem.
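For anyone wondering what a "Hebbian plastic connection" looks like in practice, here is a minimal sketch of one forward step, assuming the usual formulation in which each connection has a fixed weight plus a plasticity coefficient multiplied by a running Hebbian trace. The names `plastic_step`, `w`, `alpha`, `hebb` and `eta` are my own, and this only shows the forward pass; in the approach described above, `w`, `alpha` and `eta` would be what gradient descent optimizes.

```python
import math

def plastic_step(x, w, alpha, hebb, eta):
    """One recurrent step with Hebbian plastic connections (illustrative only).

    x: list of n activations from the previous step
    w, alpha, hebb: n x n nested lists (fixed weights, plasticity
        coefficients, Hebbian traces), indexed [pre][post]
    eta: trace learning rate in [0, 1]
    """
    n = len(x)
    # Effective weight = fixed part + plasticity coefficient * Hebbian trace.
    y = [
        math.tanh(sum((w[i][j] + alpha[i][j] * hebb[i][j]) * x[i] for i in range(n)))
        for j in range(n)
    ]
    # The trace is a running average of (pre-synaptic * post-synaptic) activity.
    new_hebb = [
        [(1.0 - eta) * hebb[i][j] + eta * x[i] * y[j] for j in range(n)]
        for i in range(n)
    ]
    return y, new_hebb
```

Calling this repeatedly with the same input makes co-active connections stronger wherever `alpha` is positive, which is the "fast weights" part that keeps changing after training.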
May I add that in the biological model the plasticity is modified by the limbic system.
Memory modulation
“The amygdala is also involved in the modulation of memory consolidation. Following any learning event, the long-term memory for the event is not formed instantaneously. Rather, information regarding the event is slowly assimilated into long-term (potentially lifelong) storage over time, possibly via long-term potentiation. Recent studies suggest that the amygdala regulates memory consolidation in other brain regions. Also, fear conditioning, a type of memory that is impaired following amygdala damage, is mediated in part by long-term potentiation”
Does that mean that LTP/LTD are modified by the limbic system?
Seems to me we first need them to have ongoing, constant interaction with novel stimuli. Second, we need a universal encoder that can take input from many different kinds of sensors. And if the prowess of an AGI is a function of time, with its sophistication growing over time, are we all set? (HTMs already have plasticity.)
Perhaps the input (to the cortex) is not only encoder dependent. Some of the structures in the brainstem mix sensory input with cortical predictions to produce new cortical inputs. That seems like the proper way not only to deal with noise but also to “converge” new stimuli toward what is already known (and save valuable resources).
Sometimes this prediction mixing might affect the input in a funny way: