I haven’t done any programming in a while and would like to start a project to knock some of the rust off. While doing this project, I’d also like to do some philosophizing and hopefully have some enlightening conversation.
@kaikun has some tools set up already for Reinforcement Learning in Minecraft, with a nice GitHub page to help with the initial foundations:
GitHub - kaikun213/my-neural-net: Initial implementation of a combination of HTM and RL for a Software Agent in NUPIC.
What is the best resource for learning to code a Spatial Pooler in Python? Ideally a full tutorial with plenty of context.
Here is some philosophizing.
The second is: on the level of the individual neuron in an HTM system, is state change being directed downward toward it by the spatial and temporal algorithms? I was reading something the other day that made me wonder whether it is possible to invert the algorithm, so that the effects of the spatial and temporal poolers emerge from the properties of the neurons themselves. I was skimming “https://arxiv.org/pdf/1609.03543.pdf”. The math is above me, but the stock-trading metaphor they used piqued my interest.

Instead of money as the reward, as in the stock-trading example, what if dopamine is the reward for neurons? The neurons exist as “traders” constantly competing for a limited supply of dopamine. As a neuron gets dopamine, the connected neurons that didn’t fire with it become less likely to get dopamine in the future. When neurons fire together, however, both get dopamine from this “relationship”. This is mutually beneficial to both neurons because neuron1 “knows” that at least one dopamine-receiving neuron, neuron2, has fired.

The mini-column in this case, I’m guessing, would act as the “trading firm” in the paper and try to maximize its dopamine. This wouldn’t necessarily be done at the level of the mini-column per se; it would be a consequence of a group of neurons constantly “working together to acquire more dopamine”. The “market maker” in this case would be the thalamus, which sets the “price” of dopamine because of its limited dopamine supply. This limitation, with the thalamus constantly fixing the “price”, is where the sparsity in the network could emerge. One thing I like about this metaphor is that it gets rid of the idea that neurons are making predictions.
In this metaphor, a neuron doesn’t fire because it is predicting some external state; rather, external stimuli cause sensory transduction, and that transduction causes the first-layer neurons to fire because of the potential dopamine reward. Personally, I think prediction is a higher-level abstraction we use to anthropomorphize what these networks are doing.

The metaphor may also be useful for the voting problem in reference frames. The only model a neuron would need is one in which it maximizes its own dopamine. Yet to an external observer of that neuron, because of network effects, it appears as if the neuron “knows” about the external world and is doing complex things with that data. Neurons wouldn’t necessarily be voting on their understanding of the external world per se. Collections of connected neurons “know” that by “working” “together” they’ll get more “dopamine”; viewed through the network effect, the individual neurons appear to know more than they do.

In this scheme, a neuron’s only function would be to increase connection strength to neurons that are already firing. A potential mechanism might be that any connection to an actively firing neuron yields a small amount of dopamine. So the model might be: at each time step, a neuron gains a little dopamine from each of its connections that fired; if it meets a threshold, it also fires and loses its dopamine. It then strengthens its connections to all neurons that fired, while all other connections automatically weaken from lack of use.
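To make the discussion concrete, the update rule at the end of that paragraph could be sketched as a toy NumPy simulation. Everything here is my own guess at a minimal version: the network size, the threshold, the reward/learning/decay constants, and the global-decay rule are all assumptions, not anything from the paper or from HTM theory.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 50          # number of neurons (toy size; assumption)
THRESHOLD = 1.0 # dopamine needed to fire (assumption)
REWARD = 0.3    # dopamine gained per unit of active incoming connection weight
LEARN = 0.05    # strengthening step for connections to co-firing neurons
DECAY = 0.01    # automatic weakening "from lack of use"

# weights[i, j] is the connection strength from neuron j to neuron i.
weights = rng.random((N, N)) * 0.2
np.fill_diagonal(weights, 0.0)  # no self-connections
dopamine = np.zeros(N)          # each neuron's accumulated dopamine

def step(active):
    """One time step. `active` is a boolean array of neurons that fired last step."""
    global dopamine
    # Each neuron gains a little dopamine from every connection that fired.
    dopamine += REWARD * (weights @ active.astype(float))
    # Neurons that meet the threshold fire and lose (spend) their dopamine.
    firing = dopamine >= THRESHOLD
    dopamine[firing] = 0.0
    # Firing neurons strengthen connections to the neurons that fired with them...
    weights[np.ix_(firing, active)] += LEARN
    # ...while every connection passively weakens from lack of use.
    weights[...] = np.clip(weights - DECAY, 0.0, 1.0)
    np.fill_diagonal(weights, 0.0)
    return firing

# Drive the network with a fixed "sensory" pattern for a few steps.
active = np.zeros(N, dtype=bool)
active[:5] = True  # a hypothetical first-layer input
for _ in range(20):
    active = step(active) | active  # keep the sensory input switched on
```

Note there is no explicit prediction anywhere in the loop: each neuron only accumulates reward and reinforces whoever fired with it, which is the point of the metaphor. The thalamus-as-market-maker idea would presumably replace the fixed `THRESHOLD` with a dynamically adjusted “price” that enforces sparsity, but I haven’t attempted that here.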
The next question will be about the enteric nervous system.