Deep neural network trained only by hill climbing

https://youtu.be/RzNVyzXAtzU

What are we looking at here?

A 5-layer deep neural net with 54*1024 weight parameters, trained in a way so simple that likely no one has bothered attempting it in decades. I listed the papers that indicate the possibility here: https://groups.google.com/forum/#!topic/artificial-general-intelligence/TVvry50HEwI

Essentially, if the net has enough dimensions (and presumably meets some other technical requirements), there is, with extremely high probability, always some direction you can move the weight values in to get an improvement. The weight space is simply so huge that the probability of being blocked is negligible. That’s my reading of the situation, anyway. So far it seems to work in practice, perhaps surprisingly well.
I think if you were to try it in an entirely binary system you could hit some roadblocks. For example, if changing one bit caused an avalanche that altered 50% of the other bits, I don’t think that would work.
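
For concreteness, here is a minimal sketch of that loop in Python: mutate every weight in a random direction, keep the mutation if the loss improves, otherwise revert. The network shape, toy data, and step size below are my own illustrative assumptions, not the exact setup from the video.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy regression problem (a hypothetical stand-in for the real task).
    X = rng.standard_normal((256, 16))
    y = np.sin(X.sum(axis=1, keepdims=True))

    # A small deep net: one weight matrix per layer.
    sizes = [16, 64, 64, 64, 1]
    weights = [0.1 * rng.standard_normal((a, b)) for a, b in zip(sizes, sizes[1:])]

    def forward(ws, x):
        h = x
        for w in ws[:-1]:
            h = np.tanh(h @ w)
        return h @ ws[-1]

    def loss(ws):
        return float(np.mean((forward(ws, X) - y) ** 2))

    best = loss(weights)
    step = 0.01  # mutation scale

    for i in range(100_000):
        # Pick a random direction in the full weight space and step along it.
        trial = [w + step * rng.standard_normal(w.shape) for w in weights]
        trial_loss = loss(trial)
        if trial_loss < best:  # hill climbing: keep only improvements
            weights, best = trial, trial_loss
        if i % 10_000 == 0:
            print(i, best)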

I think it is interesting in a more general sense because it suggests that random mutation alone can forge complex systems, as long as there are enough dimensions to maneuver in (and plenty of time).

So far the net I am training continues to improve, but I can’t say that the method won’t just trail off after a while. It’s a bit early to know.

One way to think of it is this: the solution contains n bits of information. Each hill-climbing decision to keep or reject a mutation can supply at most 1 bit of information, and in practice likely far less (say 0.001 bits). Nevertheless, n/0.001 = 1000n is still a finite number of steps.
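
As a back-of-the-envelope check of that budget (both numbers below are assumptions chosen purely for illustration):

    # Bit-budget estimate for the hill-climbing argument above.
    n_bits = 1_000_000      # suppose the solution encodes a million bits
    bits_per_step = 0.001   # assumed information gained per keep/reject decision

    steps = n_bits / bits_per_step
    print(f"~{steps:.0e} mutations")  # ~1e+09: enormous, but finite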