Anshuman Mishra talks about algorithmic speedups via locality sensitive hashing and reviews papers on bio-inspired hashing, specifically LSH inspired by fruit flies.
He first gives an overview of what algorithmic speedups are, why they are useful, and how we can use them. He then dives into a specific technique called locality sensitive hashing (LSH), covering the motivation for these types of hash algorithms and how they work. Lastly, Anshuman talks about the potential biological relevance of these hash mechanisms. He looks at the paper “A neural algorithm for a fundamental computing problem”, which outlines a version of LSH inspired by fruit flies that uses sparse projections, expands dimensionality, and applies a Winner-Takes-All mechanism.
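A minimal NumPy sketch of that kind of scheme (the function name fly_lsh, the parameter values, and the overlap check are illustrative choices, not the paper's code):

```python
import numpy as np

def fly_lsh(x, m=2000, sparsity=0.1, k=100, seed=None):
    """Fly-inspired LSH sketch: sparse binary projection, dimensional
    expansion to m units, then a Winner-Takes-All step keeping k winners."""
    rng = np.random.default_rng(seed)
    d = x.size
    # Sparse binary random projection: each of the m output units samples
    # a small random subset of the d input dimensions.
    proj = (rng.random((m, d)) < sparsity).astype(float)
    y = proj @ x                       # expand dimensionality d -> m
    tag = np.zeros(m)
    tag[np.argsort(y)[-k:]] = 1.0      # Winner-Takes-All: keep top-k, zero the rest
    return tag                         # sparse binary hash ("tag")

# Similar inputs should share many winners when hashed with the same projection.
rng = np.random.default_rng(0)
a = rng.random(50)
b = a + 0.01 * rng.standard_normal(50)                    # small perturbation of a
print(np.sum(fly_lsh(a, seed=1) * fly_lsh(b, seed=1)))    # overlap out of k=100
```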
Robin Hood hash tables waste less memory. You can fill them to 90% or more and still get good look-up times. The insert-only version is very easy to do. Deleting is more difficult; the best way is to maintain a histogram of the displacements involved and do some complicated unwinding steps.
It’s not too difficult though.
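A minimal insert-and-lookup sketch of the Robin Hood scheme (illustrative class and method names; no deletion and no resizing, so a real table would need to grow before it fills up):

```python
class RobinHoodTable:
    """Insert-only Robin Hood hash table: open addressing where an incoming
    entry steals a slot from any resident that is closer to its home slot."""

    def __init__(self, capacity=16):
        self.capacity = capacity
        self.slots = [None] * capacity   # each slot: (key, value, displacement)

    def _home(self, key):
        return hash(key) % self.capacity

    def insert(self, key, value):
        entry = (key, value, 0)
        idx = self._home(key)
        while True:
            resident = self.slots[idx]
            if resident is None:
                self.slots[idx] = entry
                return
            if resident[0] == entry[0]:                  # update existing key
                self.slots[idx] = (entry[0], entry[1], resident[2])
                return
            # Robin Hood rule: if the incoming entry is farther from its home
            # slot than the resident, the resident gives up its slot.
            if entry[2] > resident[2]:
                self.slots[idx], entry = entry, resident
            idx = (idx + 1) % self.capacity
            entry = (entry[0], entry[1], entry[2] + 1)   # one step farther from home

    def lookup(self, key):
        idx, dist = self._home(key), 0
        while True:
            resident = self.slots[idx]
            # Early exit: once residents are closer to their homes than we are
            # to ours, the key cannot appear any further along the probe chain.
            if resident is None or resident[2] < dist:
                return None
            if resident[0] == key:
                return resident[1]
            idx, dist = (idx + 1) % self.capacity, dist + 1
```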
The main point of a random projection is that it spreads the input information “evenly” across all of the output: a change in a single input value affects all the output values. It can’t be too even, though, or you would just have a structured transform like the FFT, so you use random values in the projection matrix. Since there is an almost unlimited number of random matrices you can choose, there is an almost unlimited number of different windows onto the input information.
You can look through one window and then through another, and stacking those different views gives you a dimensional increase in the ways of seeing the input.
Or, since the input information has been spread out, you can sub-sample a random projection and still have an approximate picture of the entire input. You can use such sub-samples for approximate comparison.
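A small NumPy sketch of that: project two nearby inputs, then compare them using only a random subset of the output coordinates (the sizes and the 1/sqrt(n) scaling are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 256, 256
R = rng.standard_normal((n, d)) / np.sqrt(n)    # dense Gaussian random projection

x = rng.standard_normal(d)
y = x + 0.05 * rng.standard_normal(d)           # a nearby input
px, py = R @ x, R @ y

# Because the information is spread out, a random subset of output
# coordinates still approximates the distance between the two inputs.
keep = rng.choice(n, size=32, replace=False)
full_dist = np.linalg.norm(px - py)
sub_dist = np.linalg.norm(px[keep] - py[keep]) * np.sqrt(n / keep.size)
print(full_dist, sub_dist)                      # the rescaled estimate is close
```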
You can binarize the output of a random projection to get a locality sensitive hash (LSH). That entails some information loss but makes comparison easy.
If you sub-sample from a full-size LSH, similar inputs will hash to the same binary pattern or differ by only 1 or 2 bits. You have no control over the decision boundaries though… nevertheless…
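A minimal sketch of such a sign-binarized LSH, compared with Hamming distance (the dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_bits = 128, 64
R = rng.standard_normal((n_bits, d))     # one random hyperplane per hash bit

def lsh(x):
    # The sign of each random projection gives one bit of the hash.
    return (R @ x > 0).astype(np.uint8)

x = rng.standard_normal(d)
y = x + 0.1 * rng.standard_normal(d)     # similar input
z = rng.standard_normal(d)               # unrelated input

print(np.sum(lsh(x) != lsh(y)))          # small Hamming distance
print(np.sum(lsh(x) != lsh(z)))          # around n_bits / 2 for unrelated inputs
# Sub-sampling the hash is just taking a subset of the bits, e.g. lsh(x)[:16].
```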
With the dimensional increase you end up with a large number of independent non-linear terms relating to the input (non-linear because binarization is a non-linearity). You can take a weighted sum of those terms and, by adjusting the weights, arrange to learn different responses to different inputs.
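One way to sketch that: fix a random projection, binarize it, and fit the weights of the weighted sum on a toy task (the task and the least-squares fit are illustrative choices, not the only way to adjust the weights):

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 20, 512                                 # input dim, expanded dim
R = rng.standard_normal((m, d))                # fixed random projection

def features(X):
    return np.sign(X @ R.T)                    # binarization is the non-linearity

# Toy task: respond with the sign of the first input coordinate.
X_train = rng.standard_normal((500, d))
t_train = np.sign(X_train[:, 0])

F = features(X_train)
w, *_ = np.linalg.lstsq(F, t_train, rcond=None)   # adjust the weights

X_test = rng.standard_normal((100, d))
pred = np.sign(features(X_test) @ w)
print(np.mean(pred == np.sign(X_test[:, 0])))     # accuracy on the toy task
```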
The fast Walsh-Hadamard transform (WHT) costs n·log2(n) add/subtract operations, with the adds and subtracts done in particular butterfly patterns.
If a special chip were constructed with parallel and pipelined operations, the effective time cost could be brought down to O(1).
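A straightforward in-place version of the fast WHT (the input length must be a power of two):

```python
import numpy as np

def fwht(a):
    """Fast Walsh-Hadamard transform: n*log2(n) adds/subtracts in butterflies."""
    a = np.asarray(a, dtype=float).copy()
    h = 1
    while h < a.size:
        for i in range(0, a.size, 2 * h):
            for j in range(i, i + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y   # one add and one subtract
        h *= 2
    return a

print(fwht([1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0]))
```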
The WHT equates to a structured matrix H, which needs to be destructured to get a random projection. This can be done using a diagonal matrix D with random +1/−1 entries. A quick random projection for a large non-sparse vector, such as image data, can then be as simple as HD. For sparse input data, or if you want better quality, you can use HDHD. You can also use a random permutation matrix P. Since H, D and P are invertible, the resulting random projections are invertible, or partially invertible if you sub-sample.
Compressive sensing techniques can allow full inversion in the sub-sampled case for sparse data.
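A sketch of the HD / HDHD construction (repeating the fwht routine so the snippet is self-contained; the 1/sqrt(n) normalization is an illustrative choice that keeps the transform orthogonal, hence invertible):

```python
import numpy as np

def fwht(a):
    """Unnormalized fast Walsh-Hadamard transform (length a power of two)."""
    a = np.asarray(a, dtype=float).copy()
    h = 1
    while h < a.size:
        for i in range(0, a.size, 2 * h):
            for j in range(i, i + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y
        h *= 2
    return a

def hd_projection(x, rounds=2, seed=0):
    """Apply (HD) `rounds` times: random sign flips (D) then the WHT (H).
    rounds=1 is the cheap HD projection, rounds=2 the better-quality HDHD."""
    rng = np.random.default_rng(seed)
    n = x.size
    y = np.asarray(x, dtype=float)
    for _ in range(rounds):
        signs = rng.choice([-1.0, 1.0], size=n)   # the diagonal of D
        y = fwht(signs * y) / np.sqrt(n)          # H applied to D*y, normalized
    return y

x = np.random.default_rng(1).standard_normal(1024)
p = hd_projection(x)
print(np.linalg.norm(x), np.linalg.norm(p))       # norms match: invertible transform
```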
HDHD… equates to a matrix with binomially distributed entries, tending toward a matrix with Gaussian-distributed entries as the dimension grows (by the central limit theorem).
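A quick numerical check of that, building a small normalized HDHD matrix explicitly (sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256

H = np.array([[1.0]])                 # Sylvester construction of the Hadamard matrix
while H.shape[0] < n:
    H = np.block([[H, H], [H, -H]])
D1 = np.diag(rng.choice([-1.0, 1.0], size=n))
D2 = np.diag(rng.choice([-1.0, 1.0], size=n))

M = (H @ D1 @ H @ D2) / n             # normalized HDHD as an explicit matrix

# Each entry is a scaled sum of n independent +/-1 terms (binomial), so by the
# central limit theorem the entries look Gaussian for large n.
print(M.ravel().mean(), M.ravel().std())   # mean ~ 0, std ~ 1/sqrt(n)
```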