A Review of Learning Rules in Machine Learning - March 8, 2021

In this research meeting, our research intern Alex Cuozzo reviews some notable papers and explains high-level concepts related to learning rules in machine learning. Moving away from backpropagation with gradient descent, he talks about various attempts at biologically plausible learning regimes which avoid the weight transport problem and use only local information at the neuron level. He then moves on to discuss work that infers a learning rule from weight updates, and further work using machine learning to create novel optimizers and local learning rules.
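As one concrete illustration of avoiding the weight transport problem with a local rule (not necessarily one of the exact methods covered in the talk), feedback alignment replaces the transposed forward weights in the backward pass with a fixed random feedback matrix. A minimal numpy sketch, with all sizes and names chosen purely for illustration:

```python
# Sketch of feedback alignment: the backward pass uses a fixed random
# matrix B instead of W2.T, so no weight transport is required.
# Generic illustrative code, not code from the talk.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 20, 64, 5

W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
W2 = rng.normal(scale=0.1, size=(n_hidden, n_out))
B = rng.normal(scale=0.1, size=(n_out, n_hidden))   # fixed random feedback

def relu(x):
    return np.maximum(x, 0.0)

x = rng.random((32, n_in))      # a random batch of inputs
y = rng.random((32, n_out))     # random regression targets
lr = 0.01

for step in range(200):
    h = relu(x @ W1)            # forward pass
    y_hat = h @ W2
    e = y_hat - y               # output error

    # Backprop would propagate e @ W2.T here; feedback alignment sends the
    # error through the fixed random matrix B instead.
    dh = (e @ B) * (h > 0)

    W2 -= lr * h.T @ e / len(x)
    W1 -= lr * x.T @ dh / len(x)
```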

Papers / Talks mentioned (in order of presentation):


You can evolve even large nets. E.g., a simple mutation-based system like Continuous Gray Code optimization will work.
Why bother when there is backpropagation?
There are two answers: generalization and federated learning.
You get much better generalization because the full data set is used at each training step, not mini-batches as with backpropagation.
If you had 1 million images and 1,000 CPU cores (250 Raspberry Pi 4s, for example), then each core gets 1,000 images and the full neural model. Each core is sent the same short sparse list of mutations to make to the neural model and returns the cost for its part of the training data.
The costs are summed; if there is an improvement, an "accept mutations" message is sent to each core, otherwise a "reject" message.
The hardware cost is less than 2 high-end GPUs and the power consumption is about 5 kW, which is about $1 an hour in most places ($75 in Texas).
Of course it would help to use fast networks like sparse ones or other special kinds.
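A minimal single-process sketch of that accept/reject loop might look like the following. The mutation distribution, the toy "model", and helper names such as make_sparse_mutations and shard_cost are illustrative assumptions, not the actual Continuous Gray Code scheme or any real distributed setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sparse_mutations(n_weights, n_mutations, magnitude=0.01):
    # Pick a few weight indices and give each an exponentially scaled delta.
    # The exponential scaling loosely mimics mixing large and small step
    # sizes; the exact distribution here is an assumption, not the original
    # Continuous Gray Code scheme.
    idx = rng.choice(n_weights, size=n_mutations, replace=False)
    deltas = magnitude * np.exp(-10.0 * rng.random(n_mutations))
    deltas *= rng.choice([-1.0, 1.0], size=n_mutations)
    return idx, deltas

def shard_cost(weights, images, labels):
    # Cost of a stand-in linear "model" on one core's shard of the data.
    preds = images @ weights
    return float(np.mean((preds - labels) ** 2))

# Toy scale: the post describes 1,000,000 images on 1,000 cores; here we
# simulate 100 "cores" with 1,000 images each in a single process.
n_cores, shard_size, n_features = 100, 1000, 32
shards = [(rng.random((shard_size, n_features)), rng.random(shard_size))
          for _ in range(n_cores)]
weights = rng.normal(scale=0.1, size=n_features)
best_cost = sum(shard_cost(weights, x, y) for x, y in shards)

for step in range(200):
    # Broadcast the same short sparse mutation list to every core.
    idx, deltas = make_sparse_mutations(n_features, n_mutations=4)
    weights[idx] += deltas
    # Each core returns the cost for its part of the training data; sum them.
    total_cost = sum(shard_cost(weights, x, y) for x, y in shards)
    if total_cost < best_cost:      # improvement -> "accept mutations"
        best_cost = total_cost
    else:                           # no improvement -> "reject", undo mutations
        weights[idx] -= deltas
```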

What about apical dendrites that receive backward-propagating information? Maybe that is a good signal to tell distal dendrites that now is a good time to form a coincidence detector.

The question of "now" is not clear to me. How narrow a window do we have to learn what is good? My guess is that biology allows an approximate "now".

Hello! We started answering HTM forum questions at the beginning of our weekly research meetings. You can find Alex and Subutai’s response here: https://youtu.be/UbxcyLqLfc0?t=34 (start at 0:34)

Here are some suggested papers that might be helpful:
• Towards Deep Learning with Segregated Dendrites
• Dendritic solutions to the credit assignment problem
