HTM in OpenAI Gym

Has anyone used OpenAI Gym? I saw a YouTube comment saying we should try to get an HTM agent into OpenAI Gym, but I know nothing about it except that it uses TensorFlow or Theano as a platform.

Is anyone interested in getting NuPIC running in OpenAI Gym?

Hey @rhyolight, I recently started playing around with OpenAI Gym with the goal of applying HTM to it, but it may be a little early, and I have not yet had time to push it very far. We can talk about it if you want. OpenAI Gym does not use any specific platform, and has an extremely simple API.

We can always use classic reinforcement learning algorithms, building policy or value models with HTM systems, but a more HTM-esque approach would be to wait for our research on sensorimotor inference to advance. Once we have stable code supporting this work, OpenAI Gym would be a great thing to try.
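For anyone who has not seen it, the "extremely simple API" is essentially a reset/step loop. Here is a minimal sketch of that loop. It does not import the real gym package: ToyEnv is a hypothetical stand-in that copies the classic reset() and step(action) shape, and random_policy is only a placeholder for wherever an HTM-based policy or value model would eventually go.

```python
import random


class ToyEnv(object):
    """Hypothetical stand-in with a classic Gym-style interface (not the real gym)."""

    def __init__(self, max_steps=10):
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        # Start a new episode and return the first observation.
        self.steps = 0
        return 0.0

    def step(self, action):
        # Advance one step; return (observation, reward, done, info).
        self.steps += 1
        observation = float(self.steps)
        reward = 1.0
        done = self.steps >= self.max_steps
        return observation, reward, done, {}


def random_policy(observation, rng):
    # Placeholder policy: ignores the observation and picks action 0 or 1.
    return rng.choice([0, 1])


def run_episode(env, policy, rng):
    # The whole agent/environment interaction is this loop.
    observation = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(observation, rng)
        observation, reward, done, _info = env.step(action)
        total_reward += reward
    return total_reward


if __name__ == "__main__":
    print(run_episode(ToyEnv(), random_policy, random.Random(0)))
```

Swapping in a real environment should only mean replacing ToyEnv; the loop itself stays the same.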

As part of our work with OpenAI Gym, @ericlaukien has added some new tasks which you can extend via his PGE library. It’s a drop-in replacement that uses only open source components instead of the proprietary 3D engines that OpenAI Gym relies on. The team at OpenAI is currently assessing it for inclusion in future OpenAI environments. We’re looking for feedback, especially if you have examples using HTM technologies.

Reposting this from another post, in case this might be useful to others:

@natoromano ran some experiments with HTM and OpenAI. He presented his results during an engineering meeting and his code is here: https://github.com/natoromano/nupic.research/tree/gym/projects/gym

Cheers!

Marion

Hi, I was trying to get @natoromano's OpenAI Gym example working. I’m having trouble with a call to convertedDepolarizeCells() in extended temporal memory:

RuntimeError: CHECK FAILED: 
"std::all_of( activeCellsExternalBasal, activeCellsExternalBasal + activeCellsExternalBasalSize, [&](UInt c) { return c < numBasalInputs_; })" 
Values in activeCellsExternalBasal must be within the range [0,0).

This is using nupic.bindings==0.4.13. If I use the latest nupic.bindings==0.5.1, then I get a compile error in temporal_memory_factory.py (nupic.research) for the import:

from nupic.bindings.experimental import (
  ExtendedTemporalMemory as ExtendedTemporalMemoryCPP)

So I’m a little unclear which nupic.bindings version should be used to run the latest code in nupic.research (I’m using the latest nupic, 0.5.7).

Getting out of the weeds, the Network architecture is an observation sensor, an action sensor, an SPRegion, and an ExtendedTMRegion. The key point is that the action sensor feeds into the ETM “basal-ly”, so perhaps my question is: could someone point me to a basic example of how to run the latest nupic.research ExtendedTMRegion?

Many thanks

Ryan

Perhaps @scott or @mrcslws could find time to help you out, but remember what the research readme says. If that code has been abandoned, there may not be much interest.

If you’re using experimental/research code, you should be running everything in “develop” mode, rather than versioned binary releases. Getting your environment up to speed is a little trickier now, but it can be done!

I made an attempt at documenting the process for setting up your repositories to work with the newly minted nupic.research.core fork of nupic.core at https://github.com/numenta/nupic.research/wiki/nupic.core-feature-branch-workflow. Build nupic.research.core-derived nupic.bindings from source, and install nupic from checkout.

The ETM is now aware of its basal/apical input dimensions, so these need to be set in the TM params.

I see this code doesn’t set the RawSensor’s width, so I think it will use the default of 2048.

So, add to the TM_PARAMS:

"basalInputWidth": 2048,

It would be nice if the region would figure out the width of the basal input rather than requiring you to set this. But last time I tried this I failed – the region couldn’t check the size of the input during initialization. At least partly because it doesn’t actually receive the inputs and outputs during initialization.
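To make the failure mode concrete: the range [0,0) in the error message suggests the bound on basal inputs was 0 when the check ran, so every external basal index was rejected. Below is a pure-Python mirror of that check, not NuPIC code. The function name check_external_basal is made up for illustration, and it assumes basalInputWidth is the value that ends up as the bound. The other entries of TM_PARAMS are left out.

```python
def check_external_basal(active_cells, num_basal_inputs):
    """Hypothetical mirror of the check quoted in the error message.

    Every active external basal cell index must satisfy
    0 <= index < num_basal_inputs.
    """
    out_of_range = [c for c in active_cells if not 0 <= c < num_basal_inputs]
    if out_of_range:
        raise RuntimeError(
            "Values in activeCellsExternalBasal must be within the range "
            "[0,%d)." % num_basal_inputs)


# Existing TM parameters are elided here; only the new key is shown.
TM_PARAMS = {}
TM_PARAMS["basalInputWidth"] = 2048  # assumed to match the RawSensor default

if __name__ == "__main__":
    # With the width set, indices below 2048 pass the check.
    check_external_basal([0, 17, 2047], TM_PARAMS["basalInputWidth"])
    # With a width of 0, any index at all fails, as in the reported error.
    try:
        check_external_basal([17], 0)
    except RuntimeError as error:
        print(error)
```

If the action sensor were configured with a different width, the value in TM_PARAMS would presumably need to match it.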

Thanks a bunch, Marcus! That helped – I’ve got the cart-pole example running now. (using an old nupic.bindings release, I will have to update that when I figure out how to build the core)

I wonder what performance @natoromano got? I’m seeing scores in the low 30’s on average at best (number of steps the pole was kept upright).

Number of steps at trial 57 is 16
Average of last 100 is 32.81

Number of steps at trial 58 is 47
Average of last 100 is 33.05
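For anyone comparing numbers, here is a minimal sketch of how an "average of last 100" figure like the one above can be tracked. It assumes the average is taken over however many trials exist so far, up to a window of 100; the TrialLog name is hypothetical and this is not the code from the example.

```python
from collections import deque


class TrialLog(object):
    """Hypothetical helper tracking a running mean over the most recent trials."""

    def __init__(self, window=100):
        # A bounded deque drops the oldest entry once the window is full.
        self.recent = deque(maxlen=window)

    def record(self, steps):
        # Store this trial's step count and return the mean over the window.
        self.recent.append(steps)
        return sum(self.recent) / float(len(self.recent))


if __name__ == "__main__":
    log = TrialLog(window=100)
    for trial, steps in enumerate([16, 47]):
        average = log.record(steps)
        print("Number of steps at trial %d is %d" % (trial, steps))
        print("Average of last 100 is %.2f" % average)
```

With fewer than 100 trials recorded, the reported average is simply the mean of all trials so far.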

FYI, I’ll be sharing the code as I go along if anyone is interested: ryanjmccall/htm-agents on GitHub (software agents based on HTM theory).