Idealistic Approach to Deep Learning Paper

I describe an Artificial Neural Network in which words are mapped to individual neurons rather than fed into the network as input variables. Switching training cases is equivalent to a dropout procedure in which some (or all) of the words/neurons from the previous training case are replaced with new ones. Each neuron/word then takes as input all the b weights of the other neurons and scales their sum by its personal a weight. To learn, this network uses the back-propagation algorithm on an error computed at an output neuron, which is a traditional neuron. The network therefore has a unique topology and functions with no inputs. Learning uses coordinate gradient descent, alternating between training the a weights of the words and the b weights. The Idealistic Neural Network is an extremely shallow network that can represent non-linear complexity in a linear form.
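A minimal toy sketch of the idea as I read it, with all names, sizes, and the squared-error loss being my own assumptions: each word/neuron has a scalar a and b weight, a training case activates a subset of word neurons, each active neuron scales the sum of the *other* active b weights by its own a weight, and a linear output neuron sums the activations. The loop alternates between updating the a and b weights, coordinate-descent style.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: one neuron per word, each with a personal 'a' weight and a
# broadcast 'b' weight. Vocabulary and targets are illustrative only.
vocab = ["cat", "sat", "mat", "dog", "ran"]
a = rng.normal(scale=0.1, size=len(vocab))
b = rng.normal(scale=0.1, size=len(vocab))

def forward(active, a, b):
    # Each active neuron i takes in the b weights of the *other* active
    # neurons and scales their sum by its own a weight; a traditional
    # linear output neuron then sums those activations.
    s = b[active].sum()
    return float((a[active] * (s - b[active])).sum())

# "Training cases" are sets of active words; switching cases swaps
# words/neurons in and out, the dropout-like step described above.
cases = [(np.array([0, 1, 2]), 1.0), (np.array([3, 4]), -1.0)]

def total_loss(a, b):
    return sum((forward(act, a, b) - t) ** 2 for act, t in cases)

loss_before = total_loss(a, b)
lr = 0.05
for epoch in range(200):
    for active, t in cases:
        err = 2.0 * (forward(active, a, b) - t)
        if epoch % 2 == 0:   # coordinate descent: even epochs train a ...
            a[active] -= lr * err * (b[active].sum() - b[active])
        else:                # ... odd epochs train b
            b[active] -= lr * err * (a[active].sum() - a[active])
loss_after = total_loss(a, b)
print(loss_before, "->", loss_after)
```

Because the output is bilinear in a and b, holding one set fixed makes the subproblem for the other set a plain linear least-squares step, which is what makes the alternating scheme natural here.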

There are two different papers there, in case you would like to read both of them.



I’m in the process of finishing off a chatbot with a large corpus of Q/A, and have found wide and shallow networks to generalize the most appropriate response a lot better than deeper ones. I have an intuition that having an input layer the size of my vocabulary might be creating something similar to a ‘one neuron per word’ setup.
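For concreteness, here is a minimal sketch of what that layout might look like, with every size and name hypothetical (the actual client system isn't shown here): a bag-of-words input the size of the vocabulary feeding one wide hidden layer, scoring a fixed set of candidate responses.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy shapes -- all hypothetical, just to illustrate the layout.
vocab_size, hidden, n_responses = 2000, 512, 50

W1 = rng.normal(scale=0.01, size=(vocab_size, hidden))
W2 = rng.normal(scale=0.01, size=(hidden, n_responses))

def pick_response(token_ids):
    # Bag-of-words input: each word flips on its own input unit, so one
    # row of W1 acts as that word's private parameters -- close to a
    # "one neuron per word" setup.
    x = np.zeros(vocab_size)
    x[token_ids] = 1.0
    h = np.maximum(0.0, x @ W1)    # the single wide hidden layer
    return int(np.argmax(h @ W2))  # index of the best-scoring response

choice = pick_response([12, 404, 1337])
print("response id:", choice)
```

In this shape, widening `hidden` adds capacity without adding depth, which matches the wide-and-shallow behaviour described above.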

I can’t share much more about it, as it’s for a client, but the concept seems to be playing out in front of me.