Statistical Channels

Each output dimension of a linear fast transform is a different dot product of the input vector. Each is therefore a different filter result on, or statistic of, the input vector. Applying parametric functions to those outputs lets you nudge the statistics produced by a further fast transform this way and that. The pattern is statistics, nudges based on decision making, statistics, nudges…
That is a fast transform neural network with preordained statistical channels.
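The statistics-then-nudges pattern can be sketched roughly as follows. This is only an illustration under assumptions: the Walsh-Hadamard transform stands in for "a linear fast transform", a per-channel two-slope function stands in for "parametric functions", and the names fwht, two_slope and forward are hypothetical, not taken from any library.

```python
import numpy as np

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform; length must be a power of 2."""
    x = np.asarray(x, dtype=float).copy()
    n = x.shape[0]
    assert n > 0 and n & (n - 1) == 0, "length must be a power of 2"
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b          # sums
            x[i + h:i + 2 * h] = a - b  # differences
        h *= 2
    return x

def two_slope(x, pos, neg):
    """Assumed parametric function: slope pos[i] where x[i] >= 0, else neg[i].
    The sign test is the decision; the chosen slope is the nudge."""
    return np.where(x >= 0, pos * x, neg * x)

def forward(x, params):
    """params is a list of (pos, neg) slope vectors, one pair per layer."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    for pos, neg in params:
        x = fwht(x) / np.sqrt(n)    # fixed statistical channels
        x = two_slope(x, pos, neg)  # trainable nudges, one per channel
    return fwht(x) / np.sqrt(n)     # final statistics
```

With every slope set to 1 the nudges do nothing, and because the normalized transform is its own inverse a single layer simply returns the input. Only the slopes would be trained; the transform itself stays fixed, which is what makes the channels preordained.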
With a conventional artificial neural network, things get mixed together. The statistical channels emerge during training and are mixed in with the amount of statistical nudging, while, separately, ReLU or whatever does the decision making.
You need to understand the statistics of the dot product for that comment to be helpful, I suppose.
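As a small check on the dot product view, the sketch below compares a fast transform against explicit dot products. It again assumes the Walsh-Hadamard transform as the example fast transform, and hadamard and fwht are hypothetical helper names written out here rather than library calls.

```python
import numpy as np

def hadamard(n):
    """Sylvester-ordered Hadamard matrix of size n (n a power of 2)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.kron(np.array([[1.0, 1.0], [1.0, -1.0]]), H)
    return H

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform, same ordering as hadamard()."""
    x = np.asarray(x, dtype=float).copy()
    n = x.shape[0]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b
            x[i + h:i + 2 * h] = a - b
        h *= 2
    return x

x = np.array([3.0, 1.0, -2.0, 0.5, 4.0, -1.0, 0.0, 2.5])
H = hadamard(8)

# One explicit dot product per output dimension: input against a +1/-1 pattern.
dots = np.array([np.dot(row, x) for row in H])
outputs = fwht(x)

print(np.allclose(dots, outputs))  # the fast transform gives the same numbers
```

The first row of H is all ones, so the first output is just the sum of the inputs. Every other row is a different pattern of additions and subtractions, so every other output is a different difference-type statistic of the same input.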