I created this document.
https://archive.org/details/2-pathway-re-lu-big-picture
It says a number of things.
An ordinary ReLU neural network breaks the input space down into a number of piecewise-linear regions. However, a change in the input within one region produces only a one-dimensional wiggle in each neuron's output.
With a 2-pathway ReLU (which responds to both positive and negative activations, with a different output pathway active for each), a change in the input within a region can produce a change in the output in any direction.
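A minimal sketch of that construction, assuming the 2-pathway ReLU simply routes the positive part and the negative part of each activation to separate outputs (so the layer width doubles and no information is zeroed away):

```python
import numpy as np

def two_pathway_relu(x):
    """Split each activation into its positive and negative parts,
    routing each to its own output pathway (doubles the width).
    Unlike a standard ReLU, nothing is discarded: the two pathways
    together reconstruct the input exactly."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([np.maximum(x, 0.0), np.minimum(x, 0.0)])

# A standard ReLU would zero out the -2.0 entirely; here it survives
# on the negative pathway.
print(two_pathway_relu([1.5, -2.0, 0.5]))
```

Because each half feeds its own set of downstream weights, the two regions on either side of a neuron's switching boundary get independent linear maps, rather than "linear vs. zero".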
That also means more information is preserved as it flows through the neural network. It isn't just reduced to a bunch of one-dimensional responses inside the network.
Also, there hasn't been clear thinking about the first and last weight matrices in neural networks. It is really not good if they end up being low rank. And ought they not be replaced by some fast-transform arrangement?