ReLU is great and all that, but you are just zeroing channels based on decisions (x > 0?).
You could create interactions between the decisions and use those for weight matrix interaction.
If the input vector is x, you would need to decorrelate the +1/-1 entries in D from x, or you would get abs()-type behavior. A locality-sensitive hash of x with +1/-1 output bits is one option; another would be a derangement-type permutation of D.
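A rough NumPy sketch of the point about decorrelating D from x. It assumes D is a diagonal matrix stored as a vector d of +1/-1 entries and applied elementwise. The sign-of-random-projection hash and the cyclic shift are just one illustrative choice for each option, and the names d_naive, d_hash, d_shift and R are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
x = rng.standard_normal(n)

# Naive choice: d is the sign of x itself, so d * x collapses to abs(x).
d_naive = np.where(x > 0, 1.0, -1.0)
assert np.allclose(d_naive * x, np.abs(x))

# Option 1 (assumed form of the hash): sign of a fixed random projection,
# a common locality-sensitive hash with +1/-1 output bits.
R = rng.standard_normal((n, n))  # drawn once, then kept fixed
d_hash = np.where(R @ x > 0, 1.0, -1.0)

# Option 2: a derangement of d_naive. A cyclic shift by one position is a
# derangement for n > 1, since no entry stays in its original slot.
d_shift = np.roll(d_naive, 1)

# Each entry of x now has its sign flipped by a bit that is not its own sign.
y_hash = d_hash * x
y_shift = d_shift * x
```

Either way the magnitudes of x are preserved and only the signs change, which is why the choice of d matters so much here.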
Other interactions between the elements of D could be logical AND, OR, etc. A road not yet traveled.
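As a small illustration of what logical interactions between the entries of D might look like, here is a sketch that treats +1 as true and -1 as false (an assumption, the thread does not fix an encoding). The helper names pm1_and, pm1_or and pm1_xor are hypothetical, and pairing each entry with its cyclic neighbour is only one possible way to choose which decisions interact.

```python
import numpy as np

def pm1_and(a, b):
    # True (+1) only when both inputs are +1.
    return np.minimum(a, b)

def pm1_or(a, b):
    # True (+1) when at least one input is +1.
    return np.maximum(a, b)

def pm1_xor(a, b):
    # True (+1) when the inputs differ.
    return -a * b

d = np.array([1.0, 1.0, -1.0, -1.0])
partner = np.roll(d, 1)  # each decision is paired with its cyclic neighbour

d_and = pm1_and(d, partner)
d_or = pm1_or(d, partner)
d_xor = pm1_xor(d, partner)
```

The results are again +1/-1 vectors, so they could be used in place of d as a new diagonal sign pattern.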