SDR Implementation of Neural Network Layers

I’ve been trying to apply Sparse Distributed Representations (SDR) to simple neural network layers. Can anyone suggest a method for achieving this? I want to ensure that only 2-5% of the neurons are active in each layer.

1 Like

Assuming you’re doing the classic input -> hidden layer(s) -> output, you could do something like sort each layer by score, set the bottom 98% to 0 and the top 2% to 1, and then move on to the next layer. I’ve never thought to do this before; I’m curious what the result would be.
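
Something like this rough NumPy sketch of the idea (the layer sizes, the 2% figure, and the binarize-to-1 step are just illustrative assumptions, not something I’ve tested):

```python
import numpy as np

def kwta_binary(activations, sparsity=0.02):
    """Keep only the top `sparsity` fraction of units per sample:
    winners become 1, everything else becomes 0."""
    k = max(1, int(round(sparsity * activations.shape[-1])))
    # indices of the k largest activations in each row
    winners = np.argpartition(activations, -k, axis=-1)[..., -k:]
    mask = np.zeros_like(activations)
    np.put_along_axis(mask, winners, 1.0, axis=-1)
    return mask  # binary, SDR-like output fed to the next layer

# toy usage: batch of 4 samples, hidden layer of 500 units
h = np.random.randn(4, 500)
sdr = kwta_binary(h, sparsity=0.02)
print(sdr.sum(axis=-1))  # 10 active units per sample (2% of 500)
```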

2 Likes

This would be a form of k-winners-take-all (k-WTA), which is one approach.

Another approach would be to add a regularization term to your loss function. Something like an L1 (LASSO) or L0 penalty should encourage the network to learn sparser activations.
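
One way to read this is to put the L1 penalty on the hidden activations themselves rather than on the weights, so no unit is permanently removed. A minimal PyTorch sketch (the layer sizes and the penalty strength are made-up numbers you’d have to tune):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 500), nn.ReLU())
classifier = nn.Linear(500, 10)
opt = torch.optim.Adam(list(model.parameters()) + list(classifier.parameters()))
l1_weight = 1e-4  # assumed strength; increase until only ~2-5% of units stay active

def training_step(x, y):
    h = model(x)                                 # hidden activations
    logits = classifier(h)
    task_loss = nn.functional.cross_entropy(logits, y)
    sparsity_loss = l1_weight * h.abs().mean()   # L1 on activations, not weights
    loss = task_loss + sparsity_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```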

2 Likes

Thank you Jossos for your response.

I’m working on a Forward-Forward neural network for digit classification on the MNIST dataset. The first few layers of the network are trained in an unsupervised manner, and then I apply a classifier to classify the digits after the unsupervised model has been trained. Now I want to introduce sparsity at each layer of the unsupervised model, aiming for something like 2% of neurons being active, similar to how the brain activates specific neurons for specific tasks. I want the network to behave such that for one digit certain neurons are active, and for another digit a different set of neurons is active.

You suggested setting 98% of the neurons to zero and leaving 1-2% active, but I have a question: does setting neurons to zero actually force them to produce outputs close to zero? I’ve tried approaches like weight/neuron pruning, but they don’t seem to compel the neurons to produce outputs near zero; we are just forcibly setting the weights/neurons to zero.
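
To make the distinction I’m asking about concrete, here is a toy PyTorch sketch of the two things I’m comparing (the sizes, the 50% pruning ratio, and the 2% figure are arbitrary):

```python
import torch

layer = torch.nn.Linear(784, 500)
x = torch.randn(32, 784)

# (a) weight pruning: the pruned weights are forced to zero and stay zero,
#     but the surviving weights can still drive any unit's output high
prune_mask = (torch.rand_like(layer.weight) > 0.5).float()
layer.weight.data *= prune_mask

# (b) activation masking (k-WTA): the layer computes its outputs normally,
#     then everything except the top 2% of units per sample is zeroed before
#     being passed on, so the "losing" units contribute exactly zero downstream
h = torch.relu(layer(x))
k = max(1, int(0.02 * h.shape[1]))
winners = torch.topk(h, k, dim=1).indices
mask = torch.zeros_like(h).scatter_(1, winners, 1.0)
h_sparse = h * mask
```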

1 Like

Thank you CollinsEM for your response.

I was thinking about that, but again, with LASSO we are driving weights to zero and then we cannot use them again. Is it possible to have a few neurons active at one point in time (with the rest nearly zero), and at another point in time have a different set of neurons active while the rest are nearly zero? If there is a way, please suggest it, or we can discuss it here.
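
Roughly, the behaviour I’m after looks like this toy NumPy check (random weights and inputs, purely to illustrate that the active set changes per input while no weights are removed):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((784, 500))  # a stand-in hidden layer

def active_set(x, sparsity=0.02):
    h = np.maximum(x @ W, 0.0)                # hidden activations (ReLU)
    k = max(1, int(sparsity * h.shape[-1]))
    return set(np.argpartition(h, -k)[-k:])   # indices of the winning units

x_a, x_b = rng.standard_normal(784), rng.standard_normal(784)
winners_a, winners_b = active_set(x_a), active_set(x_b)
print(len(winners_a), len(winners_a & winners_b))  # 10 winners each, small overlap
```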

1 Like

If you search for “sparse activation arxiv” you’ll find quite a few papers swarming around the idea.

E.g., one paper notices that the hidden layer in a transformer’s feed-forward MLP blocks tends to naturally become sparsely activated during training.

There are also others attempting to “provoke” sparsity by various means, for various reasons.

PS: also beware that the neurons are not only sparsely activated but also sparsely connected. What I mean is that instead of having (a potential for) full connections from a neuron in layer N to all neurons in layer N+1, in practice each one can only connect to a significantly smaller number of them.
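
As a quick sketch of what I mean (the ~10% connectivity figure is only an example), you could fix a random binary connectivity mask per layer once and apply it to the weight matrix on every forward pass:

```python
import torch
import torch.nn as nn

class SparselyConnectedLinear(nn.Module):
    """Linear layer where each output unit connects to a fixed random
    subset (~10% here) of the inputs, chosen once and never trained."""
    def __init__(self, in_features, out_features, connectivity=0.1):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        mask = (torch.rand(out_features, in_features) < connectivity).float()
        self.register_buffer("mask", mask)

    def forward(self, x):
        return nn.functional.linear(x, self.linear.weight * self.mask,
                                    self.linear.bias)

layer = SparselyConnectedLinear(784, 500, connectivity=0.1)
out = layer(torch.randn(32, 784))  # gradients only flow through allowed connections
```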

1 Like