Random projection (aka RCA) helps reduce dimensions. It can even be used to increase dimensions (like a kernel function). It’s reported that downstream ML algorithms still perform well on the projected data.
Since the brain is pretty random (different people have different brains), I’m wondering whether the brain naturally has a random projection mechanism. E.g. a few neuron regions always have certain connections, no matter how the data changes. Such a region would ‘compress’ the data into lower dimensions, or transform it into higher dimensions.
Questions
(1) The brain has such regions/neurons, right?
(2) Do they reduce or increase dimensions? I guess they reduce dimensions.
(3) I believe the wiring can’t change in the short term (e.g. daily or weekly). But does it change yearly, or over a lifetime?
I’m not a neuroscientist; I have just followed Jeff and Numenta for years. Your input would be helpful.
I suppose you can view the brain as a random 3D crossbar with each neuron having a set of made connections and a set of potential but unmade connections. And in a sense the union of those sets is some form of random projection. I’m not sure how much of the math from linear random projections you can apply though. Anyway you can use the idea as another concept to help reason about the brain.
However you construct neural nets, the recent papers showing that there are no local minima, only saddle points, tell you something very important: that simplified simulated annealing is sufficient unto the day. You can say that simulated annealing with no need of upward steps will do, or you can say that random hill climbing with a cooling rate will do. When new data comes in, some random accommodation to that new data is fine so long as the amount of accommodation reduces over the longer term (as you get old).
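Here is a minimal sketch of what I mean by random hill climbing with a cooling rate. The toy cost function, the step size and the cooling rate are all assumptions picked for illustration, not taken from any of those papers. A random change is kept only if it does not make things worse, and the size of the changes shrinks over time.

```python
import random

def sphere_cost(point):
    """Toy cost function (an assumption for illustration): sum of squares."""
    return sum(v * v for v in point)

def cooled_hill_climb(cost, start, steps=2000, step_size=1.0, cooling=0.998, seed=0):
    """Random hill climbing: keep a random change only if the cost does not rise.

    step_size is multiplied by `cooling` after every step, so the amount of
    random accommodation shrinks over time. No upward steps are ever taken.
    """
    rng = random.Random(seed)
    best = list(start)
    best_cost = cost(best)
    for _ in range(steps):
        # Propose a random perturbation of the current best point.
        candidate = [v + rng.gauss(0.0, step_size) for v in best]
        candidate_cost = cost(candidate)
        if candidate_cost <= best_cost:
            best, best_cost = candidate, candidate_cost
        step_size *= cooling  # cool down
    return best, best_cost

start_point = [3.0] * 5
initial_cost = sphere_cost(start_point)
final_point, final_cost = cooled_hill_climb(sphere_cost, start_point)
print(initial_cost, final_cost)
```

Because worse candidates are never accepted, the final cost can never be above the starting cost. How fast it drops depends on the cooling rate you pick.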
By random projections, do you mean random connections forming between hubs, or a more directed formation of projections, like the association/dissociation of neuronal assemblies (Gerstein, G. L., Bedenbaugh, P., and Aertsen, M. H. 1989. Neuronal assemblies. IEEE Trans. Biomed. Eng. 36, 4–14)?
Ed Bullmore co-wrote a fascinating paper using graph theory to describe this phenomenon (“Small-World Brain Networks”, Bassett, D. S. and Bullmore, E., The Neuroscientist, 12(6), 2006).
IMHO, random connection and pruning processes came earlier in evolutionary history, the process becoming more directed with the emergence of more complex and hierarchical brains, directed by the goal of increasing fitness for the organism.
Yeah, I don’t know what happened to that link. Anyway, I can point you to code if you are just looking for a fast algorithm: https://github.com/S6Regen/EvoNet
A random projection (RP) is just a dot product between the input data and a randomly chosen vector. In higher dimensions random vectors are approximately orthogonal. That means if you sample an input vector with a reasonable number of random projections you don’t lose much information.
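To make that concrete, here is a small sketch in plain Python. The sizes (1000 inputs, 200 projections) and the Gaussian choice of random vectors are my assumptions for illustration. It checks two things: two random vectors in a high-dimensional space have a cosine close to zero, and the distance between two inputs is roughly preserved after projecting them.

```python
import math
import random

def random_projection_matrix(n_in, n_out, rng):
    """Each row is a random Gaussian vector, scaled by 1/sqrt(n_out) so that
    squared lengths are preserved on average."""
    scale = 1.0 / math.sqrt(n_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_in)] for _ in range(n_out)]

def project(matrix, x):
    """One dot product per random vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]

def cosine(a, b):
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)

def distance(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

rng = random.Random(0)
n_in, n_out = 1000, 200
R = random_projection_matrix(n_in, n_out, rng)

# Near-orthogonality: cosine between two of the random vectors.
row_cosine = cosine(R[0], R[1])

# Distance preservation: compare distances before and after projection.
x = [rng.gauss(0.0, 1.0) for _ in range(n_in)]
y = [rng.gauss(0.0, 1.0) for _ in range(n_in)]
ratio = distance(project(R, x), project(R, y)) / distance(x, y)
print(row_cosine, ratio)
```

With these sizes the cosine should come out close to 0 and the distance ratio close to 1. Fewer projections give a noisier ratio.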
There was a paper recently showing random projection in an insect brain. In that case the RP wasn’t used for dimensionality reduction, but rather to increase the number of dimensions to create a micro extreme learning machine. http://fastml.com/extreme-learning-machines/
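A rough sketch of the extreme learning machine idea, in plain Python: a fixed random layer expands the input into more dimensions, and only a linear readout on top is trained. The XOR task, the 20 hidden units, the tanh nonlinearity and the gradient-descent readout are my assumptions for illustration; this is not a model of the insect brain, and ELMs are usually fitted with a least-squares solve rather than gradient descent.

```python
import math
import random

def make_random_layer(n_in, n_hidden, rng):
    """Fixed random weights and biases; these are never trained."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]
    biases = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
    return weights, biases

def expand(x, weights, biases):
    """Random projection into more dimensions, then tanh.
    The trailing 1.0 is a constant bias feature for the readout."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(weights, biases)]
    return hidden + [1.0]

def mean_squared_error(readout, features, targets):
    errors = [sum(r * f for r, f in zip(readout, h)) - t
              for h, t in zip(features, targets)]
    return sum(e * e for e in errors) / len(errors)

def train_readout(features, targets, lr=0.05, epochs=2000):
    """Batch gradient descent on the linear readout weights only."""
    n = len(features)
    readout = [0.0] * len(features[0])
    for _ in range(epochs):
        grad = [0.0] * len(readout)
        for h, t in zip(features, targets):
            err = sum(r * f for r, f in zip(readout, h)) - t
            for k, f in enumerate(h):
                grad[k] += err * f / n
        readout = [r - lr * g for r, g in zip(readout, grad)]
    return readout

rng = random.Random(0)
inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
targets = [0.0, 1.0, 1.0, 0.0]  # XOR, not linearly separable in 2 dimensions

weights, biases = make_random_layer(n_in=2, n_hidden=20, rng=rng)
features = [expand(x, weights, biases) for x in inputs]  # 2 -> 21 dimensions

untrained_mse = mean_squared_error([0.0] * len(features[0]), features, targets)
readout = train_readout(features, targets)
trained_mse = mean_squared_error(readout, features, targets)
print(untrained_mse, trained_mse)
```

The point is that the random layer is left alone. All the learning happens in the readout, which only has to solve a linear problem in the expanded space.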
This is one of my pet questions. It has been shown in natural language processing (NLP) that random projection can replace singular value decomposition (SVD) in the method of latent semantic analysis (LSA) with similar performance results (plus gaining on-line capabilities). Both are dimensionality reduction techniques. The authors of LSA showed that dimensionality reduction actually helped improve test results on word synonym tests and they even hypothesized that this is key to how the mind works. It’s a great paper [1].
In fact the man who developed the foundation of the SDR theory that Numenta uses, Pentti Kanerva, is also one of the names behind this result of substituting random projection for SVD in LSA.
It always seemed to me that cognition works by stereotyping. It is by reducing the flux of information to categories, to associate (similar) things with each other and rendering them under the same token, that reasoning can take place. I’ve sometimes wondered whether the success of the ReLU is precisely due to its deadpanning effect.
[1] Landauer & Dumais, 1997, “A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge”, Psychological Review, 104(2), 211–240
This research team has fast code for random projections, and also a paper: https://github.com/FALCONN-LIB/FFHT and https://arxiv.org/abs/1509.02897
Really everyone should use GPUs to implement RPs. However it looks like AMD’s new 32-core CPUs will get you to, say, about 1/4 of a GPU’s performance.
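For anyone wondering why these are fast: a common construction is a random sign flip followed by a Walsh-Hadamard transform, which costs O(n log n) instead of the O(n²) of a dense random matrix. Below is a slow but readable pure Python sketch of that construction. The function names are mine, and this is not the API of the FFHT library, only an illustration of the general idea.

```python
import math
import random

def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform. Length must be a power of 2."""
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("length must be a power of 2")
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y  # butterfly step
        h *= 2
    return a

def fast_random_projection(x, signs, n_out):
    """Random sign flip, Hadamard transform, then keep the first n_out outputs."""
    n = len(x)
    flipped = [s * v for s, v in zip(signs, x)]
    # Dividing by sqrt(n) makes the transform an orthonormal rotation.
    rotated = [v / math.sqrt(n) for v in fwht(flipped)]
    # Rescale so the squared length is unchanged on average after truncation.
    scale = math.sqrt(n / n_out)
    return [v * scale for v in rotated[:n_out]]

def norm(v):
    return math.sqrt(sum(p * p for p in v))

rng = random.Random(0)
n = 1024
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
signs = [rng.choice((-1.0, 1.0)) for _ in range(n)]  # fixed random sign pattern

full = fast_random_projection(x, signs, n)       # pure rotation, same length
reduced = fast_random_projection(x, signs, 128)  # dimensionality reduction
print(norm(x), norm(full), norm(reduced))
```

Keeping all n outputs gives a random rotation that preserves the vector length up to rounding. Keeping only some of them gives a reduced vector whose length matches only approximately.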
I combine them with evolution leading to this idea: https://groups.google.com/forum/#!topic/artificial-general-intelligence/4aKEE0gGGoA
If that is true, and it surely would take some high-level math to prove it, then evolution for bacteria and viruses really is a blind watchmaker. For diploids the watchmaker can tell light from dark and maybe make out broad shapes, for example when a recessive gene is expressed even slightly, which gives an informational clue. And for carefully constructed digital systems the watchmaker is a cyclops with mono-vision (using a highly informative single summary measure returned by a cost function).