Lack of Biological Correlation: Are these NN Flaws still valid?

Read here and decide?

http://www.i-programmer.info/news/105-artificial-intelligence/7352-the-flaw-lurking-in-every-deep-neural-net.html

Does anyone who’s familiar with classical NN tech know if these limitations reflect the current state of NNs?

Yes, as far as I know adversarial examples have a certain universality to them. They have been attributed to linearity in NNs, and in very high-dimensional spaces the volume tends to concentrate near the surface, so it is relatively easy to find such examples. Moreover, adversarial examples have been shown to generalize: they are largely independent of architecture and training data.
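To make the linearity point concrete, here is a toy sketch of my own (not from the linked article or the paper): for a purely linear score w·x, a per-component nudge of size eps in the direction sign(w) shifts the score by eps·‖w‖₁, which grows with the input dimension, so a visually tiny perturbation can move the score a long way.

```python
import numpy as np

# Toy illustration of the linearity argument: for a linear score w.x, an
# L-infinity-bounded nudge eps*sign(w) shifts the score by eps*||w||_1,
# which grows roughly linearly with the input dimension.
rng = np.random.default_rng(0)
for dim in (10, 1_000, 100_000):
    w = rng.normal(size=dim)          # weights of a linear "classifier"
    x = rng.normal(size=dim)          # some input
    eps = 0.01                        # tiny per-component perturbation
    x_adv = x + eps * np.sign(w)      # gradient-sign style nudge
    print(dim, round(float(w @ x_adv - w @ x), 2))   # equals eps * ||w||_1
```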

Please see the paper below…


The current deep neural networks are superhuman. That is because they use non-local learning.
Humans and Hopfield nets, by contrast, are stuck with local learning at each neuron.
The Two Minute Papers channel on YouTube does an excellent job of showcasing the current state of the art with deep nets: https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg
To do non-local learning you really need multiple GPUs to allow experimentation within a reasonable time frame.
The alternative is local learning (which seems to me synonymous with attractor-state learning).
That can be much faster. There is also the idea that maybe you can elevate the Hopfield network one step above what is possible in the human brain by using decision trees combined with ID3 as a reworked type of neuron. Anyway, there is some relatively recent work that improves the learning algorithm for conventional Hopfield networks using a minimum probability flow training algorithm:
https://arxiv.org/abs/1411.4625
https://arxiv.org/abs/1204.2916
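For contrast with the non-local back-propagation case, here is a minimal sketch of the kind of local learning a classical Hopfield network does (plain one-shot Hebbian storage, my simplification, not the minimum-probability-flow training from the papers above): each weight is set using only the activity of the two neurons it connects.

```python
import numpy as np

# Minimal Hopfield sketch: Hebbian storage is "local" in that each weight
# w[i, j] is computed only from the activity of neurons i and j.
def train_hebbian(patterns):
    n = patterns.shape[1]
    w = np.zeros((n, n))
    for p in patterns:                   # each p is a vector of +1/-1
        w += np.outer(p, p)
    np.fill_diagonal(w, 0)
    return w / len(patterns)

def recall(w, state, steps=20):
    for _ in range(steps):               # synchronous updates, for brevity
        state = np.sign(w @ state)
        state[state == 0] = 1
    return state

rng = np.random.default_rng(1)
patterns = rng.choice([-1, 1], size=(3, 64))   # 3 patterns, 64 neurons
w = train_hebbian(patterns)
noisy = patterns[0].copy()
noisy[:8] *= -1                          # corrupt 8 of the 64 bits
print(np.array_equal(recall(w, noisy), patterns[0]))   # usually True
```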

Anyway, Numenta should get its head around random projections. The papers out there are horrifyingly mathematical, but the basics are simple to conceptualize. Each point in the input maps to a Gaussian-noise-like pattern in the output, which makes it an unbiased transform, unlike the Fourier transform, which maps a point to a sine or cosine wave and is biased, not least because it maps each point to a single frequency. Usually the random patterns are chosen orthogonal to one another, and then you can invert the random projection.
This gives you the ability to create a distributed representation of your input data.
I mention this because you can then use the dropout idea from deep neural networks.
RP your input data, randomly zero some of the elements, and then do another RP. The idea is that when you use that as input to a training algorithm, it will learn to respond to a region around the given input rather than just the exact input itself. That could help you create a learning algorithm with a better ability to generalize. However, the deep neural net researchers apparently don't understand how or why their own idea works, so there you go.
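Here is a minimal sketch of that RP-then-dropout idea under my own assumptions (an orthogonal random matrix so the first projection is exactly invertible by its transpose, and a 20% drop rate):

```python
import numpy as np

rng = np.random.default_rng(2)

def random_orthogonal(n):
    # QR of a Gaussian matrix gives a random orthogonal matrix,
    # so the projection is invertible by its transpose.
    q, _ = np.linalg.qr(rng.normal(size=(n, n)))
    return q

n = 256
R1, R2 = random_orthogonal(n), random_orthogonal(n)

x = rng.normal(size=n)               # input vector
h = R1 @ x                           # first RP: a noise-like, distributed representation
mask = rng.random(n) > 0.2           # "dropout": zero ~20% of the components
h_dropped = h * mask
y = R2 @ h_dropped                   # second RP, ready to feed to a learner

# Because R1 is orthogonal, the undamaged projection inverts exactly:
print(np.allclose(R1.T @ h, x))      # True
# The dropped version reconstructs only approximately, i.e. a nearby point:
print(np.linalg.norm(R1.T @ h_dropped - x) / np.linalg.norm(x))
```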

So the layers in a deep neural network that has just been randomly initialized are effectively random projections. When an error signal is back-propagated, it ends up (via summation) as Gaussian noise, by the central limit theorem. Presumably that Gaussian noise is driving the search (there is a small demo after this paragraph).
The question is whether there is some implicit evolutionary algorithm or a dissipative algorithm operating. I would guess it is a dissipative algorithm, with the weight parameters relaxing into a local minimum. There are suggestions in the literature that such nets have a multitude of local optima that are still very good.
Anyway, the dropout idea should help with things like the Hopfield network as well, by helping carve out a response region for things to fall into. What use is a singularity if it doesn’t have a gravitational field to drag in nearby things?
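Here is the small demo mentioned above (a toy of my own construction): feed deliberately non-Gaussian "errors" back through a randomly initialized weight matrix and check that the summed result looks roughly Gaussian, as the central limit theorem suggests.

```python
import numpy as np

# Toy demo of the central-limit point: the error reaching a lower-layer unit
# is a sum of many weighted terms, delta_lower = W.T @ delta_upper. With a
# randomly initialized W, that sum is approximately Gaussian even when the
# upper-layer errors themselves are heavy-tailed and skewed.
rng = np.random.default_rng(3)
upper, samples = 1024, 5000

delta_upper = rng.exponential(size=(samples, upper)) - 1.0   # very non-Gaussian
W = rng.normal(scale=1 / np.sqrt(upper), size=(upper, 1))
delta_lower = delta_upper @ W        # one back-propagated error per sample

z = (delta_lower - delta_lower.mean()) / delta_lower.std()
# Skew ~0 and excess kurtosis ~0 indicate a roughly Gaussian shape.
print("skew:", float(np.mean(z ** 3)))
print("excess kurtosis:", float(np.mean(z ** 4) - 3))
```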

So the idea of how current deep neural nets learn is that of a Jostle net. The weights get knocked around by Gaussian noise, arising from the summation (an approximate random projection) of the back-propagated errors. When, by happenstance, the weights end up in a good configuration, they are less likely to be knocked around further. Thus the system gradually sheds disorder and settles down.
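Purely as an illustration of that jostle picture (a made-up toy of my own, not a claim about what back-propagation actually does): here the weights receive random knocks, knocks that worsen the fit are undone, so good configurations resist further disturbance and the system settles.

```python
import numpy as np

# Made-up "jostle" toy: weights get random knocks, and knocks that worsen
# the fit are undone, so the system gradually sheds disorder and settles
# near a good configuration.
rng = np.random.default_rng(4)
target = rng.normal(size=16)
w = rng.normal(size=16)

def error(w):
    return float(np.sum((w - target) ** 2))

for step in range(1, 20001):
    knock = rng.normal(scale=0.05, size=16)
    if error(w + knock) < error(w):      # good configurations resist being knocked out
        w += knock
    if step % 5000 == 0:
        print(step, round(error(w), 4))  # error shrinks as the jostling settles
```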

You’re talking about using random projections for encoding data into SDR format, yes?

You can use random projections to make sparse data, for sure (there is a small sketch below).
I dug out a non-assembly-language version I wrote from my files: https://drive.google.com/open?id=0BwsgMLjV0Bnhc0ZnN2ZqS3Rua2s
I may still have similar Java code somewhere.
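And here is the small sketch mentioned above of one way to go from a random projection to something SDR-like; the top-k thresholding step and the sizes are my own choices, not a Numenta encoder.

```python
import numpy as np

# Sketch: dense input -> random projection -> keep the top-k components as
# the "on" bits of a sparse binary vector (SDR-like). The top-k step and
# the dimensions are arbitrary choices for illustration.
rng = np.random.default_rng(5)
in_dim, out_dim, k = 64, 2048, 40        # ~2% sparsity

R = rng.normal(size=(out_dim, in_dim)) / np.sqrt(in_dim)

def to_sparse(x):
    y = R @ x
    sdr = np.zeros(out_dim, dtype=np.uint8)
    sdr[np.argsort(y)[-k:]] = 1          # indices of the k largest projections
    return sdr

a = rng.normal(size=in_dim)
b = a + 0.05 * rng.normal(size=in_dim)   # a slightly perturbed copy of a
c = rng.normal(size=in_dim)              # an unrelated input
print(int(to_sparse(a) @ to_sparse(b)))  # high overlap for similar inputs
print(int(to_sparse(a) @ to_sparse(c)))  # near-chance overlap (~ k*k/out_dim)
```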