A testcase of HTM with reinforcement learning

asterocean · June 14, 2016, 9:21am

NuPIC implements a model that mimics the core behaviour of the brain as it responds to universal inputs.

Presuming all of that works well, we get a system that can respond to sensor input, yet it is passive, because the system lacks its own motive.

Pains and desires are the compelling forces that drive living things to act.

So I built a testcase called the snake game, which introduces pain and desire into the system, in the hope that it helps the snake adjust its actions to fulfil its desire with the least pain.

In the game, the snake moves around in a square split into a grid, within which food appears at random. If the snake catches the food, its desire score goes up, and if it hits the wall, it feels pain. The vision, pain, desire, and direction control can easily be encoded into SDRs.
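To make the encoding idea concrete, here is a minimal sketch in plain Python. It is not the NuPIC encoder API and not the code from the snake_nupic package. The names `GRID`, `W`, `encode_category`, `encode_scalar` and `encode_state` are hypothetical, and "vision" is simplified to the head and food positions, which is an assumption of this sketch.

```python
GRID = 8   # hypothetical grid side length
W = 3      # active bits per encoded field (assumption)


def encode_category(index, n_categories, w=W):
    """Give each category its own block of w active bits (no overlap)."""
    sdr = [0] * (n_categories * w)
    for i in range(index * w, (index + 1) * w):
        sdr[i] = 1
    return sdr


def encode_scalar(value, min_val=0.0, max_val=1.0, n_buckets=10, w=W):
    """Encode a scalar so that nearby values share active bits."""
    value = min(max(value, min_val), max_val)  # clip into range
    bucket = int(round((value - min_val) / (max_val - min_val) * (n_buckets - 1)))
    sdr = [0] * (n_buckets + w - 1)
    for i in range(bucket, bucket + w):
        sdr[i] = 1
    return sdr


def encode_state(head, food, direction, pain, desire):
    """Concatenate the per-field encodings into one binary vector."""
    return (
        encode_category(head[0] * GRID + head[1], GRID * GRID)    # head cell
        + encode_category(food[0] * GRID + food[1], GRID * GRID)  # food cell
        + encode_category(direction, 4)                           # 0..3 heading
        + encode_scalar(pain)
        + encode_scalar(desire)
    )


sdr = encode_state(head=(1, 2), food=(5, 5), direction=0, pain=0.0, desire=1.0)
print(len(sdr), sum(sdr))
```

Positions and direction are treated as categories, so different cells share no bits, while pain and desire are treated as scalars, so similar levels overlap. Whether that is the right choice for the snake is an open question.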

Under this hypothesis, the system should evolve the snake into a food hunter, which would demonstrate a complete machine intelligence that adapts to its environment, avoids danger, and pursues its own desires.

I’ve been working with NuPIC for several days and still can’t figure out how to get the snake to act based on the vision, pain, and desire it senses. Any suggestions?

Below are Python scripts to illustrate the idea.

https://pypi.python.org/pypi/snake_nupic

scott · June 14, 2016, 4:07pm

The problem that you’re describing is a sensorimotor and goal-oriented behavior problem.

Sensorimotor means that the snake must model changes that it observes that are caused by its own actions, as opposed to external changes that it passively observes.

Goal-oriented behavior means the snake must additionally drive motor commands in pursuit of some goals. In your case, there are positive and negative signals and the snake must learn to drive sequences of motor commands to get to the positive rewards.
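As a rough illustration of those positive and negative signals, the grid world can be written as a single step function that maps a motor command to a new state and a reward. This is a sketch under assumptions: the grid size, the reward values of +1.0 and -1.0, and the names `GRID`, `MOVES` and `step` are all hypothetical, and the snake is reduced to its head.

```python
import random

GRID = 8  # hypothetical grid side length
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(head, food, action, rng=random):
    """Apply one motor command and return (new_head, new_food, reward)."""
    d_row, d_col = MOVES[action]
    row, col = head[0] + d_row, head[1] + d_col

    # Hitting the wall: negative signal (pain), the snake stays where it was.
    if not (0 <= row < GRID and 0 <= col < GRID):
        return head, food, -1.0

    new_head = (row, col)

    # Reaching the food: positive signal (desire), food respawns elsewhere.
    if new_head == food:
        new_food = new_head
        while new_food == new_head:
            new_food = (rng.randrange(GRID), rng.randrange(GRID))
        return new_head, new_food, 1.0

    # Ordinary move: no signal.
    return new_head, food, 0.0


print(step((0, 0), (3, 3), "up"))     # wall hit from the top-left corner
print(step((2, 2), (5, 5), "down"))   # plain move
```

The sensorimotor part is the mapping from (state, action) to the next state. The goal-oriented part is choosing actions so that the accumulated reward is high.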

NuPIC does not currently have models for sensorimotor or goal-oriented behavior. Numenta and others are actively working on advancing HTM theory and extending our understanding to include these components of cortical function.

If you don’t want to wait for the theory to advance, which could take years, there are some ways to set up models to solve the snake problem. You can try using the experimental sensorimotor model in nupic.research:
https://github.com/numenta/nupic.research/blob/master/htmresearch/algorithms/TM_SM.py

We can’t provide support for that code, and we will be changing it substantially in the future, but you can try it out. You can search that repository for example code that uses it. The idea would be to learn a sensorimotor representation of your grid world, and then, when you want to drive behavior, simulate different motor sequences and look at the predictions.
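The "simulate motor sequences and look at the predictions" idea can be sketched independently of any HTM code. The example below uses a plain lookup table as a stand-in for the learned sensorimotor model. It does not use the TM_SM.py API, and `TransitionModel`, `plan` and `ACTIONS` are hypothetical names. Transitions that were never observed are scored as zero, which is an assumption of this sketch.

```python
from itertools import product

ACTIONS = ("up", "down", "left", "right")


class TransitionModel:
    """Toy stand-in for a learned sensorimotor model (lookup table only)."""

    def __init__(self):
        self.next_state = {}  # (state, action) -> predicted next state
        self.reward = {}      # (state, action) -> reward seen for that move

    def learn(self, state, action, new_state, reward):
        self.next_state[(state, action)] = new_state
        self.reward[(state, action)] = reward

    def predict(self, state, action):
        key = (state, action)
        if key not in self.next_state:
            return None, 0.0  # never observed: no prediction
        return self.next_state[key], self.reward[key]


def plan(model, state, depth=3):
    """Roll out every motor sequence of the given depth, keep the best one."""
    best_action, best_score = None, float("-inf")
    for seq in product(ACTIONS, repeat=depth):
        current, score = state, 0.0
        for action in seq:
            predicted, reward = model.predict(current, action)
            if predicted is None:
                break  # stop the rollout where the model has no prediction
            score += reward
            current = predicted
        if score > best_score:
            best_action, best_score = seq[0], score
    return best_action, best_score


model = TransitionModel()
model.learn((0, 0), "right", (0, 1), 0.0)
model.learn((0, 1), "right", (0, 2), 1.0)   # food was found here
model.learn((0, 0), "left", (0, 0), -1.0)   # wall was hit here
print(plan(model, (0, 0), depth=2))
```

Exhaustive search grows as 4^depth, so this only works for short horizons. A real model would also predict with uncertainty rather than return a single next state.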

floybix · June 15, 2016, 12:31am

Sounds like a worthy experiment.

Here are some other test problems for reinforcement learning: https://gym.openai.com/

btw, I changed the topic title from “A testcase of HTM theory”.

Fraz_J · June 19, 2016, 11:57pm

A lot of the results in deep learning neural networks are the result of empiricism. When you read the papers by top researchers, the mathematics often has little or nothing to do with the systems they have built up by trial and error (and guessing). There are 2 strands there that actually don’t meet. A more focused and sensible form of empiricism is called engineering. That is the middle way. Also, since engineering components are well understood, anything you build with them you can explain and rationally defend.
