I think the main application in machine learning is efficiency: things like fast random projections and locality-sensitive hashing (LSH).
And by the way, it is useful to think of LSHs as random features. Then using an LSH for weight block selection is rather like max-pooling over random features.
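To make the analogy concrete, here is a minimal sketch (my own illustration, not a specific system): a sign-random-projection LSH is just the bit pattern of random linear features, and "weight block selection" can be read as picking the block whose random feature fires hardest, i.e. an argmax/max-pool over those same features.

```python
import numpy as np

rng = np.random.default_rng(0)

d, k = 16, 8                        # input dimension, number of random features
R = rng.standard_normal((k, d))     # random projection matrix

def random_features(x):
    # Random linear features of the input.
    return R @ x

def simhash(x):
    # Sign-random-projection LSH: one hash bit per random feature.
    return (random_features(x) > 0).astype(int)

def select_block(x):
    # Max-pooling view: route to the weight block whose random
    # feature responds most strongly to this input.
    return int(np.argmax(random_features(x)))

x = rng.standard_normal(d)
bits = simhash(x)       # k hash bits
block = select_block(x) # index of the selected weight block
```

Note the hash bits are invariant to positive scaling of the input, which is exactly the "bucketing by direction" behavior that makes sign-LSH useful for similarity.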
It is rather extraordinary that you can taint random noise and transmit it in such a way that you can detect the taint but an adversary cannot. Though perhaps it is only detectable through higher-order correlations in low-external-noise situations, according to what Rader found in 1969.
Charles M. Rader was an influential electrical engineer and mathematician at MIT Lincoln Laboratory, best known for his work in digital signal processing.
FFT – He developed Rader’s algorithm (1968), which efficiently computes discrete Fourier transforms when the length is a prime number, reducing computation cost by mapping the problem to a convolution.
Walsh–Hadamard transform – He studied fast algorithms for the Walsh–Hadamard transform and its applications in communications and signal analysis.
Random number generation – He explored methods for producing pseudorandom sequences with good statistical properties, especially for simulations and communications testing.
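The first item above, Rader's prime-length DFT, can be sketched in a few lines (an illustrative implementation, using NumPy's FFT for the convolution step): reindexing the inputs and outputs by a primitive root mod p turns every nonzero-frequency term into one entry of a cyclic convolution.

```python
import numpy as np

def primitive_root(p):
    # Smallest generator of the multiplicative group mod prime p.
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g

def rader_dft(x):
    # DFT of prime length p via Rader's trick: with n = g^q and
    # k = g^(-m) (mod p), the k != 0 outputs become the cyclic
    # convolution of a[q] = x[g^q] with b[r] = exp(-2*pi*i*g^(-r)/p).
    p = len(x)
    g = primitive_root(p)
    a = np.array([x[pow(g, q, p)] for q in range(p - 1)])
    b = np.exp(-2j * np.pi * np.array([pow(g, -r, p) for r in range(p - 1)]) / p)
    c = np.fft.ifft(np.fft.fft(a) * np.fft.fft(b))   # cyclic convolution
    X = np.empty(p, dtype=complex)
    X[0] = x.sum()                                   # DC term handled separately
    for m in range(p - 1):
        X[pow(g, -m, p)] = x[0] + c[m]
    return X

x = np.random.default_rng(2).standard_normal(7)      # 7 is prime
X = rader_dft(x)                                     # agrees with np.fft.fft(x)
```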
Talking about "missed serendipity": Rader knew about the Walsh–Hadamard transform, and he knew about some effects of random sign flipping in relation to it. Yet (as far as we know) he failed to connect the ideas and create fast random projections, which he was only a hair's breadth away from.
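The connection he missed is now standard: random sign flips followed by a fast Walsh–Hadamard transform (and a subsample of the output) give a Johnson–Lindenstrauss-style random projection in O(n log n) instead of O(nk). A minimal sketch (my own illustration of the subsampled randomized Hadamard construction):

```python
import numpy as np

def fwht(x):
    # Fast Walsh-Hadamard transform (Sylvester ordering, unnormalized),
    # O(n log n) butterflies; n must be a power of 2.
    x = x.copy()
    n, h = len(x), 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x

rng = np.random.default_rng(1)
n, k = 16, 4
signs = rng.choice([-1.0, 1.0], size=n)       # random sign flips (diagonal D)
rows = rng.choice(n, size=k, replace=False)   # subsample k output coordinates

def fast_random_projection(x):
    # y = P H D x: sign-flip, Hadamard-transform, subsample, rescale.
    return fwht(signs * x)[rows] / np.sqrt(n)

x = rng.standard_normal(n)
y = fast_random_projection(x)                 # k-dimensional sketch of x
```

The sign flips are what make this a *random* projection: without them, the fixed Hadamard rows would concentrate energy for structured inputs, while D spreads it out across all coordinates.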