ElixerHTM: new HTM implementation

Working on it. Hoping to have rough framework done by end of week.

Edit from @rhyolight: Moved to a new topic.

There’s a remnant here: https://github.com/fergalbyrne/cort.ex (probably not that relevant for today’s HTM, if I’m not mistaken).

I might be able to use some things there (I see a lot of detail went into making it distributed, and a lot of focus on specific neuron spiking/firing). Thanks for sharing :slight_smile:

I’ve wanted to do this for a while, and just made a spike. Using Erl_Interface and Pony’s C FFI, it’s possible to create an Erlang C node. Current state: the API works “in principle”, which means that if the clustered Elixir thing takes off, it would be possible to offload computationally expensive work to a natively compiled node, e.g. in C. However, writing an Erlang C node, even with the API + FFI, is a big pain.

A similar approach is already possible now, e.g. using htm.java wrapped in JInterface or Encon Java, but I think the scope of an Elixir or Pony implementation is a bit different.

@MaxLee any progress? I’ve rebooted the Pony-Erlang C Node project, which can now receive messages from other nodes. As for HTM in Pony, I’ll have to rethink error handling, and if I get some time, I’ll continue with it.

I have a prototype of a scalar encoder and spatial pooler, with distributed SP via gensim, though I haven’t been able to work on this for about a month. I’m winding down one of my contracts and should be able to put more time into this over the next few weeks, aiming for a TM implementation within two weeks.

Got big plans for this as I move more deeply into robotics over the next year, where each node can do its own encoding before sending it off to the cluster for computation and feedback.

Superb. Is it public already by any chance?

Let me prettify it a bit and remove dead ideas, then I’ll update it on github. :slight_smile:

Update (Thursday, March 21, 2019):
Life and its demands have continued to delay my doing this, but I AM pushing hard to get this out this weekend.

Reviving this thread from the dead.

Had some time over the past couple of weeks (thank God for this damned plague???), and now have my initial implementation of HTM (with TM coming in the next couple of days) running in a distributed/concurrent manner in Elixir.

Each column acts as its own process/entity, and currently coordinates with a “PoolManager”. The PoolManager is responsible for broadcasting the SDRs out to the columns, picking the winners for TM, and letting them know to update their strengths.

With 10k columns, each turn currently takes ~32ms. It’s designed (at the moment) as a poor man’s web API, taking in only GET requests and parsing the URI. I’d like to clean that up so that it accepts and returns JSON instead, but this initial push was more about getting the thing up and going than aiming for perfection.

Better something than nothing, and now that the ball’s rolling, it should be easier to keep it moving.

@MaxLee Took a look at your repo. Looks cool. How do you distribute input across the columns?

Sending a request to the http_server at /SDR/ (the endpoint can be seen in /lib/htm/handler.ex) passes it to PoolManager.

PoolManager then converts it to a list of bits (in /lib/htm/pool_manager). That SDR pattern is then broadcast to the columns using the message {:check_sdr, sdr}, passed down through a short chain of function calls.
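
As a rough illustration of the first two steps (written in Python for brevity, not taken from the repo's Elixir code), the request path could be turned into a list of bits like this. The path format `/SDR/<string of 0s and 1s>` and the function name `parse_sdr_path` are assumptions made for the example; the real handler may expect something different.

```python
def parse_sdr_path(path: str) -> list[int]:
    """Turn a request path such as '/SDR/0110' into a list of bits.

    Assumption: the SDR arrives as a string of '0'/'1' characters after
    the '/SDR/' prefix. This format is hypothetical.
    """
    prefix = "/SDR/"
    if not path.startswith(prefix):
        raise ValueError(f"unexpected path: {path!r}")
    # Tolerate a trailing slash after the payload.
    payload = path[len(prefix):].strip("/")
    if not payload or any(ch not in "01" for ch in payload):
        raise ValueError("SDR payload must be a non-empty string of 0s and 1s")
    return [int(ch) for ch in payload]


print(parse_sdr_path("/SDR/0110"))
```

The resulting list is what would then be wrapped in the `{:check_sdr, sdr}` message and sent to every column.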

The column processes (/lib/htm/columns.ex) then independently check their connection map (initialized at column startup) against the bit pattern, reporting back to the pool manager their overlap score.
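
To make the overlap check concrete, here is a minimal Python sketch of what a single column might compute. It assumes the connection map stores a permanence value per input bit and that a synapse counts as connected at a permanence of 0.5 or more; both the data layout and the threshold are assumptions, not details confirmed by the repo.

```python
def overlap_score(connections: dict[int, float], sdr: list[int],
                  threshold: float = 0.5) -> int:
    """Count the active input bits this column is connected to.

    `connections` maps an input bit index to a permanence value
    (hypothetical layout). A synapse only counts when its permanence
    reaches `threshold`.
    """
    return sum(
        1
        for index, permanence in connections.items()
        if permanence >= threshold and index < len(sdr) and sdr[index] == 1
    )


# One column connected (or weakly connected) to input bits 0, 2, 3 and 5.
column = {0: 0.7, 2: 0.6, 3: 0.2, 5: 0.9}
print(overlap_score(column, [1, 0, 1, 1, 0, 0]))  # bits 0 and 2 count -> 2
```

Bit 3 is active but its permanence is below the threshold, and bit 5 is connected but inactive, so neither contributes to the score.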

The PoolManager increments a response counter each time it receives this message, and also keeps track of the average score. When the PoolManager has heard back from all the columns, it takes the average overlap score, filters out any columns below that score, sorts the remainder in descending order, and chooses up to “total_number_columns * 0.02” winning columns, broadcasting the “:strengthen_connections” message to them as a cast.
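
The selection step described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the post (average, filter, sort, take up to 2%); the function name `pick_winners` and the tie-breaking by column id are assumptions.

```python
def pick_winners(scores: dict[int, float], sparsity: float = 0.02) -> list[int]:
    """Choose winning columns from a {column_id: overlap_score} map."""
    if not scores:
        return []
    average = sum(scores.values()) / len(scores)
    # Drop every column that scored below the average.
    candidates = [(col, score) for col, score in scores.items() if score >= average]
    # Highest overlap first; ties broken by column id (an assumption, for determinism).
    candidates.sort(key=lambda item: (-item[1], item[0]))
    # "Up to" total_number_columns * sparsity winners.
    limit = int(len(scores) * sparsity)
    return [col for col, _ in candidates[:limit]]


scores = {col: 1 for col in range(100)}
scores.update({7: 10, 3: 9, 42: 8})
print(pick_winners(scores))  # [7, 3]
```

With 100 columns the limit is 2, so column 42 is left out even though it scored above the average.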

After opening iex, you can look at the VM stats and details by typing “:observer.start”, which pops up a GUI. The ‘Applications’ tab shows our application, “elixirhtm” (screenshot in the original post).

Not seen are the 10k columns which are sprouted when the pool is started.

wow, @MaxLee, congrats on the progress! I’ll take a look at it too.

As for my side projects: recently, I’ve experimented with libcluster and horde for distribution and service discovery. Distribution is not easy to get completely right, but with these tools it’s possible. Given the wetware nature of the brain and the robustness of SDRs, perhaps perfect consistency of a cluster is not required.

For those unfamiliar with Elixir/Erlang: message sending can be transparently uniform (same API), whether within one VM or distributed across networked VMs (nodes).

I generally don’t marry myself to any particular tech until all other options have broken… even then, I’m not permanently committed if something better comes along, so any experience you have or gain with either libcluster or horde (or anything else that is floating out there), I’m game.

Over the weekend I added basic TM functionality, growing distal connections. At the end of each turn, every winning column reports back to the pool which column:cell combo won the round. In the next round, when winning columns are chosen, those previously winning column:cell combos are sent to the new winners. As the new winners update their distal strengths, their winning cells message the previously winning column:cells, asking them to send a message back whenever they win again, so that our current winning column:cells are put into a potentially predictive state. So the winners at T - 1 message the potential winners at T to set them into a potentially predictive state.
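
The messaging scheme is easier to follow with a small single-process model. The Python sketch below collapses the per-cell processes into one object and ignores connection strengths entirely; the class name `TemporalLinks` and its methods are hypothetical and only illustrate the "winners at T - 1 notify winners at T" idea.

```python
from collections import defaultdict

Cell = tuple[int, int]  # (column index, cell index within the column)


class TemporalLinks:
    """Toy model of the 'notify me when you win again' requests between cells."""

    def __init__(self) -> None:
        # For each cell: the cells it should put into a predictive state
        # whenever it wins.
        self.targets: dict[Cell, set[Cell]] = defaultdict(set)
        self.previous_winners: set[Cell] = set()

    def step(self, winners: set[Cell]) -> set[Cell]:
        """Record one time step and return the cells predicted for the next one."""
        # Learning: winners at T ask the winners at T - 1 to notify them in future.
        for earlier in self.previous_winners:
            self.targets[earlier] |= winners
        # Prediction: each current winner notifies the cells that asked it to.
        predicted: set[Cell] = set()
        for cell in winners:
            predicted |= self.targets.get(cell, set())
        self.previous_winners = set(winners)
        return predicted


links = TemporalLinks()
a, b = {(0, 1)}, {(5, 2)}
links.step(a)          # nothing learned yet
links.step(b)          # (0, 1) is asked to notify (5, 2)
print(links.step(a))   # {(5, 2)}: seeing A again predicts B
```

In the actual system each of these sets would live inside a separate column process and the updates would be messages rather than method calls.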

So at the moment, the basic HTM mechanics are all implemented. Next efforts will be on improving IO into the API so that it sends back more meaningful, fleshed-out data. I’m thinking of storing the current state SDR using a bitmap canvas (similar to the “canvas_paint” demo you get after typing “:wx.demo” into iex). This bitmap could then be returned when someone requests the “/pool/state/” endpoint… basically, trying to overcome the “it works great, you just can’t see it” problem that backends face.

Update: Have been working on getting a JSON API in place, so soon there will be no more need for raw URI calls.

Also on the list of things to add in the next 2 weeks:

  1. Completion Callbacks
    ** When pool finishes processing an input, send output SDR to a selected API endpoint (for collection/concatenation)

  2. Pool groups
    ** When creating a pool, assign it a group_id, and choose whether or not to make it aware of other pools which share the same group ID. Enabling awareness would allow the sharing of connections between winning columns of disparate pools. The intended use would be if one pool is processing FFT information (noise, visual) while another is processing sensor information (heat, IMU, etc.) in the same timestep, so that cross-sensor data patterns could be discovered.

  3. More efficient datatypes.
    ** At the moment, connection maps are being stored as lists; will instead move towards using bitmaps, so that more processes (the basic unit of execution) can fit into memory.

  4. Easy (relatively) cross network (LAN) connections.
    ** Have initially tested sharing work between networked nodes, but this isn’t as easy as it could be yet. Looking to make it easier without requiring any extra dependencies.
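
On item 3, the appeal of bitmaps over lists can be shown with a short Python sketch, where a plain integer stands in for whatever bitmap type the Elixir code ends up using. The function names are hypothetical, and this version ignores permanence values, treating each connection as simply on or off.

```python
def to_bitmap(bits: list[int]) -> int:
    """Pack a list of 0/1 values into one integer; bit i of the result is bits[i]."""
    bitmap = 0
    for index, bit in enumerate(bits):
        if bit:
            bitmap |= 1 << index
    return bitmap


def overlap(connection_bitmap: int, sdr_bitmap: int) -> int:
    """Number of positions that are set in both bitmaps."""
    return bin(connection_bitmap & sdr_bitmap).count("1")


connections = to_bitmap([1, 0, 1, 1])
sdr = to_bitmap([1, 1, 0, 1])
print(overlap(connections, sdr))  # positions 0 and 3 are shared -> 2
```

The whole overlap check becomes one AND plus a bit count, and each connection costs one bit instead of one list element.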

I’m looking for more input for features that people want to be included when initializing a pool. Let me know, let’s make a list, and I’ll add it in.

Update:

Made a few minor structural changes in the code over the last few days, which will have large implications for the creation of pool groups.

Rather than have all the winner tracking for columns stored in the PoolManager, I broke it off into an independent process. This way, each pool in a group can report its winners to a single WinnersTracker, so that during the process of selecting winning cells in the minicolumn, a pool can decide whether or not its columns subscribe to the winners of other pools in the group.

The practical effect of the above is that we can model the creation of different pools in a group, where some pools can have their columns be temporally aware of winners in other pools (where those other pools may be processing some other type of input in the same time step), while some pools can continue to be isolated islands to themselves within the whole group of pools. Through this, we can begin to model different layers within columns/minicolumns, where some share connections, but not all.
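
A minimal Python model of that arrangement might look like the following. Only the name WinnersTracker comes from the post; the methods `register`, `report` and `visible_winners`, and the use of string pool ids, are assumptions for the sake of the sketch.

```python
class WinnersTracker:
    """Collects winners per pool and controls which pools can see each other."""

    def __init__(self) -> None:
        self.winners: dict[str, list[int]] = {}       # pool id -> latest winners
        self.subscriptions: dict[str, set[str]] = {}  # pool id -> pools it listens to

    def register(self, pool_id: str, listens_to: tuple[str, ...] = ()) -> None:
        """Add a pool; an empty `listens_to` makes it an isolated island."""
        self.winners.setdefault(pool_id, [])
        self.subscriptions[pool_id] = set(listens_to)

    def report(self, pool_id: str, winners: list[int]) -> None:
        """Store the winning columns a pool chose for the current time step."""
        self.winners[pool_id] = list(winners)

    def visible_winners(self, pool_id: str) -> dict[str, list[int]]:
        """Winners this pool may use: its own plus those of pools it subscribes to."""
        sources = {pool_id} | self.subscriptions.get(pool_id, set())
        return {src: self.winners.get(src, []) for src in sorted(sources)}


tracker = WinnersTracker()
tracker.register("audio", listens_to=("imu",))  # temporally aware of the IMU pool
tracker.register("imu")                         # isolated
tracker.report("audio", [1, 2])
tracker.report("imu", [9])
print(tracker.visible_winners("audio"))  # {'audio': [1, 2], 'imu': [9]}
print(tracker.visible_winners("imu"))    # {'imu': [9]}
```

Subscriptions are one-directional here, which matches the idea that some pools can be aware of others without the reverse being true.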

The logical hierarchy then looks something like:

IO encoder (from python, C, or other external source) ==>

API handler (server) ==>

WinnersTracker ==>

PoolManager (can have multiple of these in a group) ==>

Columns (for each pool).

Along the way, fixed a few bugs with state failing to update as expected during winner selection.

All in all, code is now in a state which will allow me to finish making the input/output more friendly, and the overall use more approachable for experimentation by others.

Also still experimenting with distributing pools/columns across nodes on a network. If anyone here has a pi cluster (of any variety of pi), I’d love to have some testers of processing distribution and performance.

Hi all (anyone who happens to read this)!

I’ve been continuing to make progress on this, to the point where running it in distributed operation has been letting me find and quash various bugs (such as an interesting problem where the columns would simply stop learning after 25 turns :hot_face:!)

I’m announcing my intention to put this out there as a platform at the end of June, so that people can freely sign up (this would definitely start out as something not requiring payment), get a key, send in a graph of their pool configuration (JSON format), and have HTM as a service: shooting data in, and getting SDRs (spatial and predictive temporal) and an anomaly score out. It will continue to be open source, but my main intent is to provide an easy on-ramp into HTM… too frequently on the forum here, people want to use HTM and apply it, but don’t have a background in programming.
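
For readers new to HTM, the anomaly score mentioned above is commonly computed as the fraction of currently active columns that were not predicted in the previous step. The Python sketch below shows that common definition; whether the service would compute it exactly this way is an assumption.

```python
def anomaly_score(active: set[int], predicted: set[int]) -> float:
    """Fraction of active columns that were not predicted (0.0 = fully expected)."""
    if not active:
        return 0.0
    return 1.0 - len(active & predicted) / len(active)


print(anomaly_score({1, 2, 3, 4}, {1, 2}))  # half the input was unexpected -> 0.5
```

A score of 0.0 means every active column was predicted, and 1.0 means none were.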

While it will always be beneficial to know programming to advance HTM design and research, I believe the programming barrier is largely artificial, and is preventing us, as a community, from on-boarding people who may have new insights and ideas to contribute, or who might point out use cases that we, in our sincere naivete, may simply not have known about.

Assuming that the community does indeed find this useful (for me, the monthly costs to run this, for servers anyway, would be ~$50 for a cluster of 10 servers though that wouldn’t include any of my time that will undoubtedly go into supporting it), I’ll probably transition it into a proper entity, set up proper agreements with Numenta, and expand it as an inexpensive, easy approach to start using HTM in different organizations, or even just for people who are too lazy or don’t want to bother with setting it up at home. At the same time, keeping everything open source will continue to allow those who want to, to still have that option.

Anyway, I just want to put this out here so that folks understand that I’m serious about making this work easily for all of us. Biological approaches will be the future, and HTM will undoubtedly be a part of that in some way.
