A Highly Artificial Path to ALife

I eventually put together a fast type of neural network with an external associative memory system.
The AI code is fine, I would say; the example problem code perhaps shows signs of reluctance.
Certainly the result is even further abstracted from the biological brain than conventional artificial neural networks are.

The associative memory is based on hash functions.
Hashing produces ‘random’ vectors in a high-dimensional space that are approximately orthogonal. A weighted sum (dot product) can then easily associate scalar values with a certain number of those ‘random’ vectors.
Using milder hashing (e.g. Locality Sensitive Hashing), further mathematics applies, allowing interpolation between the data points stored in the high-dimensional space, which isn’t possible with conventional hashing.
The variance equation for linear combinations of random variables and the central limit theorem then play a role.
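A minimal sketch of the idea, assuming a SHA-256 hash to derive the ‘random’ ±1 vectors (the names `hash_vector` and `HashMemory` are mine, purely illustrative, not from any particular library). Recall is just the dot product; the crosstalk from other stored items is a sum of many independent ±1 terms, which is where the variance equation and the central limit theorem come in:

```python
import hashlib
import numpy as np

def hash_vector(key, dim=1024):
    # Deterministically derive a pseudo-random +/-1 vector from a key via hashing.
    # Two different keys give approximately orthogonal vectors in high dimension.
    seed = int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")
    rng = np.random.default_rng(seed)
    return rng.choice([-1.0, 1.0], size=dim)

class HashMemory:
    """Associative memory: scalar values superimposed on hashed random vectors."""
    def __init__(self, dim=1024):
        self.dim = dim
        self.w = np.zeros(dim)

    def store(self, key, value):
        # Add the value along the key's (approximately orthogonal) direction.
        self.w += value * hash_vector(key, self.dim) / self.dim

    def recall(self, key):
        # Dot product recovers the value plus Gaussian-like crosstalk noise
        # whose variance shrinks as the dimension grows (central limit theorem).
        return float(self.w @ hash_vector(key, self.dim))
```

With, say, two items stored in 1024 dimensions, recall returns each value to within a small noise term, and an unseen key recalls a value near zero.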

The fast neural network I used swaps the roles of the variable weights and the fixed activation functions found in conventional artificial neural networks.
Instead, the weights are fixed by using a fast transform (a fixed filter bank), and the activation functions are made individually adjustable (parameterized).
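A sketch of one such layer, assuming the Walsh–Hadamard transform as the fixed filter bank and a two-slope activation with per-unit parameters (the class and parameter names are my own, for illustration only):

```python
import numpy as np

def wht(x):
    # Fast Walsh-Hadamard transform, O(n log n); len(x) must be a power of two.
    x = np.asarray(x, dtype=float).copy()
    h, n = 1, len(x)
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x / np.sqrt(n)  # orthonormal scaling, so wht(wht(x)) == x

class FastTransformLayer:
    """Fixed transform in place of learned weights; per-unit adjustable activations."""
    def __init__(self, n):
        # The only trainable parameters: two slopes per unit (2n in total,
        # versus n*n weights in a conventional dense layer).
        self.pos = np.ones(n)
        self.neg = 0.1 * np.ones(n)

    def forward(self, x):
        y = wht(x)  # the fixed 'weight matrix', applied in n log n operations
        return np.where(y >= 0.0, self.pos * y, self.neg * y)
```

Training then adjusts only the slope parameters, while the transform supplies the dense mixing across dimensions for free.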


Here is a paper that uses a Butterfly transform, which has a similar structure to a Fast Fourier Transform (FFT) or Walsh–Hadamard Transform (WHT):
Basically you can have a fully connected network with only 2·n·log₂(n) weights per layer, and 2·log₂(n) weights per non-linear activation function, compared to n² weights and n weights in a conventional layer.
I think it would be impossible to have something so highly structured in a biological system; however, random projections are a reasonable substitute for such structured transforms and are possible within biological systems.
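For concreteness, the per-layer weight counts quoted above work out like this (n is the layer width):

```python
import math

# Dense layer: n*n weights. Butterfly/fast-transform layer: 2*n*log2(n).
for n in (256, 1024, 4096):
    dense = n * n
    fast = int(2 * n * math.log2(n))
    print(f"n={n}: dense={dense}, butterfly={fast}, ratio={dense / fast:.1f}x")
```

At n = 1024 that is 1,048,576 dense weights versus 20,480 butterfly weights, a factor of about 51, and the gap widens with n.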


A simple visual example of a Fast Transform Neural Network:

You can argue it is a bit mechanical, but then a small conventional ReLU net is similarly mechanical. And the process of switching dot products at the zero mark is common to both.

There is a conversation going on at Konduit about some of the problems conventional artificial neural networks have:
In a conventional net there are n weights per nonlinear function, each leading to a different dimension. Unless substantial linear correlations are wanted across all those dimensions, that is a big waste, and it probably accounts for the effectiveness of pruning that has been observed in such networks.
Things need not be that way: using random projections, each weight could lead back to an individual nonlinear function.