Geoffrey Hinton explains sparse coding and drop-out


I thought the comparison between the effects of drop-out and sparse coding given in the video was very interesting. Namely, sparse coding is not really being used for feature discrimination; rather, it is used to force ensemble learning and generalization by breaking up unhelpful dependencies:


With sparse coding you can never be sure a particular feature dimension (neuron) will be active when classifying an input. Hence the system cannot overfit by building up a dependency on that one dimension. I’m sure there are more subtle aspects as well.
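The drop-out half of that comparison is easy to sketch in code. Below is a minimal, hypothetical example (standard "inverted" drop-out, not anything from the video itself): during training each unit is zeroed at random, so no downstream weight can rely on any single unit always being present, and the survivors are rescaled so the expected activation matches what the network sees at test time.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p_drop=0.5, training=True):
    """Inverted drop-out: randomly zero units during training and rescale
    the survivors by 1/(1 - p_drop), so expected activations are unchanged
    and no rescaling is needed at test time."""
    if not training:
        return activations
    mask = rng.random(activations.shape) >= p_drop  # keep each unit with prob 1 - p_drop
    return activations * mask / (1.0 - p_drop)

h = np.ones(8)              # a layer of 8 equally active units
print(dropout(h))           # a random subset is zeroed; survivors are scaled to 2.0
print(dropout(h, training=False))  # at test time the layer passes through unchanged
```

Each forward pass effectively samples a different thinned sub-network, which is why drop-out is often described as training an implicit ensemble.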