ElixirHTM: new HTM implementation

Working on it. Hoping to have rough framework done by end of week.

Edit from @rhyolight: Moved to a new topic.

3 Likes

There's a remnant here: https://github.com/fergalbyrne/cort.ex. It's probably not that relevant for today's HTM, if I'm not mistaken.

1 Like

I might be able to use some things from there (I see a lot of detail went into making it distributed, and a lot of focus on specific neuron spiking/firing). Thanks for sharing :slight_smile:

1 Like

I wanted to do this for a while. Just made a spike. Using Erl_Interface and Pony's C FFI it's possible to create an Erlang C node. Current state: the API works "in principle", which means that if the clustered Elixir thing takes off, it would be possible to offload computationally expensive work to a natively compiled node, e.g. one written in C. However, writing an Erlang C node, even with the API + FFI, is indeed a big pain.

A similar approach is already possible today, e.g. by wrapping htm.java with JInterface or Encon Java, but I think the scope of an Elixir or Pony implementation is a bit different.

@MaxLee any progress? I've rebooted the Pony-Erlang C Node project, which can now receive messages from other nodes. As for the HTM in Pony, I'll have to rethink error handling, and if I get some time, I'll continue with it.

3 Likes

I have a prototype of a scalar encoder and spatial pooler, with the SP distributed via GenServer processes, though I haven't been able to work on this for about a month. I'm winding down one of my contracts and should be able to put more time into this over the next few weeks, aiming for a TM implementation within two weeks.

Got big plans for this as I move more deeply into robotics over the next year, where each node can do its own encoding before sending it off to the cluster for computation and feedback.

3 Likes

Superb. Is it public already by any chance?

1 Like

Let me prettify it a bit and remove dead ideas, then I'll update it on GitHub. :slight_smile:

Update (Thursday, March 21, 2019):
Life and its demands have continued to delay my doing this, but I AM pushing hard to get this out this weekend.

4 Likes

Reviving this thread from the dead.

Had some time over the past couple of weeks (thank God for this damned plague???), and I have my initial implementation of HTM (with TM coming in the next couple of days) running in a distributed/concurrent manner in Elixir.

Each column acts as its own process/entity, and currently coordinates with a "PoolManager". The PoolManager is responsible for broadcasting the SDRs out to the columns, picking the winners for TM, and letting them know to update their strengths.

With 10k columns, each turn currently takes ~32 ms. It's designed (at the moment) as a poor man's web API, taking in only GET requests and parsing the URI. I'd like to clean that up so that it accepts and returns JSON instead, but this initial push was more about getting the thing up and going than aiming for perfection.
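
To make the column-as-process idea concrete, here is a minimal sketch (module and function names are hypothetical, not the actual ElixirHTM code) of sprouting a pool of column processes under a DynamicSupervisor:

```elixir
defmodule HTM.ColumnSupervisor.Sketch do
  # Hypothetical sketch: one lightweight BEAM process per column.
  use DynamicSupervisor

  def start_link(_arg) do
    DynamicSupervisor.start_link(__MODULE__, :ok, name: __MODULE__)
  end

  @impl true
  def init(:ok), do: DynamicSupervisor.init(strategy: :one_for_one)

  # Sprout `count` column processes (e.g. 10_000) once the pool starts.
  # HTM.Column.Sketch is assumed to be a GenServer taking a column id.
  def sprout_columns(count) do
    for id <- 1..count do
      {:ok, pid} = DynamicSupervisor.start_child(__MODULE__, {HTM.Column.Sketch, id})
      pid
    end
  end
end
```

BEAM processes are cheap (on the order of a few kilobytes each before they accumulate state), which is what makes 10k columns per pool practical.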

Better something than nothing, and now that the ball's rolling, it should be easier to keep it moving.

3 Likes

@MaxLee Took a look at your repo. Looks cool. How do you distribute input across the columns?

Sending a request to the http_server at /SDR/ (the endpoint can be seen in /lib/htm/handler.ex) passes it to the PoolManager.

PoolManager then converts it to a list of bits (in /lib/htm/pool_manager). That SDR pattern is then broadcast to the columns using the message {:check_sdr, sdr}. (The original post showed the chain of function calls inside the PoolManager as code screenshots, omitted here.)
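
As a rough sketch of that broadcast step (hypothetical module names, assuming the SDR arrives as a string of 0s and 1s; not the code from the repo):

```elixir
defmodule HTM.PoolManager.Sketch do
  # Hypothetical sketch of the broadcast step only.
  use GenServer

  def init(columns), do: {:ok, %{columns: columns, scores: []}}

  # Convert the incoming SDR string (e.g. "0110...") into a list of bits
  # and fan it out to every column process in the pool.
  def handle_cast({:new_input, raw_sdr}, state) do
    sdr =
      raw_sdr
      |> String.graphemes()
      |> Enum.map(&String.to_integer/1)

    Enum.each(state.columns, &GenServer.cast(&1, {:check_sdr, sdr}))

    {:noreply, %{state | scores: []}}
  end
end
```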

The column processes (/lib/htm/columns.ex) then independently check their connection map (initialized at column startup) against the bit pattern, reporting back to the pool manager their overlap score.
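
And a sketch of the column side (again hypothetical; the real version lives in /lib/htm/columns.ex), where the connection map is initialized at startup, the overlap is computed against the incoming bit pattern, and the score is reported back to a registered pool manager:

```elixir
defmodule HTM.Column.Sketch do
  # Hypothetical sketch of a column process checking its overlap.
  use GenServer

  @input_bits 2048  # assumed input width, for illustration only

  # Each column starts with a random connection map over the input bits.
  def init(id) do
    connections = for _ <- 1..@input_bits, do: Enum.random([0, 1])
    {:ok, %{id: id, connections: connections}}
  end

  # Compare the incoming bit pattern against the connection map and
  # report the overlap score back to the (registered) pool manager.
  def handle_cast({:check_sdr, sdr}, state) do
    overlap =
      state.connections
      |> Enum.zip(sdr)
      |> Enum.count(fn {conn, bit} -> conn == 1 and bit == 1 end)

    GenServer.cast(HTM.PoolManager.Sketch, {:overlap, self(), overlap})
    {:noreply, state}
  end
end
```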

The PoolManager increments a response counter each time it receives one of these messages (and also keeps a running average of the scores). Once it has heard back from all the columns, it takes the average overlap score, filters out any columns below that average, sorts the rest in descending order, and chooses up to "total_number_columns * 0.02" winning columns, broadcasting the ":strengthen_connections" message to them. (The code handling that message was shown as a screenshot in the original post.)
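
The winner-selection step described above could be sketched as a pure function (illustrative only; the 2% figure and the :strengthen_connections message come from the description, everything else is assumed):

```elixir
defmodule HTM.WinnerSelection.Sketch do
  # scores: a list of {column_pid, overlap} pairs, one per column.
  def select_and_strengthen(scores, total_number_columns) do
    average =
      Enum.sum(Enum.map(scores, fn {_pid, overlap} -> overlap end)) /
        max(length(scores), 1)

    winners =
      scores
      |> Enum.filter(fn {_pid, overlap} -> overlap >= average end)
      |> Enum.sort_by(fn {_pid, overlap} -> overlap end, :desc)
      |> Enum.take(round(total_number_columns * 0.02))

    # Tell each winning column to strengthen its active connections.
    Enum.each(winners, fn {pid, _overlap} ->
      GenServer.cast(pid, :strengthen_connections)
    end)

    winners
  end
end
```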

After opening iex, you can look at the VM stats and details by typing ":observer.start", which will then pop up a GUI. On the 'Applications' tab, our application "elixirhtm" shows up with its supervision tree. (Screenshot omitted.)

Not seen are the 10k columns which are sprouted when the pool is started.

2 Likes

wow, @MaxLee, congrats on the progress! I'll take a look at it too.

As for my side projects: recently, I've experimented with libcluster and horde for distribution and service discovery. Distribution is not easy to get completely right, but with these tools it's possible. Given the wetware nature of the brain and the robustness of SDRs, perhaps perfect consistency of a cluster is not required.

For those unfamiliar with Elixir/Erlang: message sending is transparently uniform (same API), whether it happens within one VM or across networked VMs (nodes).
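
For example (hypothetical registered name and node name), sending to a local process and to one on another machine looks almost identical:

```elixir
# Local: message a process registered on this node.
send(:pool_manager, {:new_input, "0110"})

# Distributed: message the same registered name on another node,
# once the nodes are connected (e.g. via Node.connect(:"htm@otherhost")).
send({:pool_manager, :"htm@otherhost"}, {:new_input, "0110"})
```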

1 Like

I generally don't marry myself to any particular tech until all other options have broken... even then, I'm not permanently committed if something better comes along. So if you have or gain any experience with libcluster or horde (or anything else floating out there), I'm game.

Over the weekend I added basic TM functionality, growing distal connections. At the end of each turn, every winning column reports back to the pool which column:cell combo won the round. On the next round, when winning columns are chosen, those previously winning column:cell combos are sent to the new winners. As the new winners update their distal strengths, their winning cells message the previously winning column:cells, requesting that whenever those column:cells win again, they message our current winning column:cells to put them into a potentially predictive state. In short, "T - 1"'s winners message "T"'s potential winners to set them into a potentially predictive state.
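
A rough sketch of that message flow, with entirely hypothetical names, just to make the "T - 1 primes T" relationship concrete:

```elixir
defmodule HTM.Cell.Sketch do
  # Hypothetical sketch of the "previous winners prime the next winners" flow.
  use GenServer

  def init(id), do: {:ok, %{id: id, listeners: []}}

  # A cell that wins at time T subscribes to a cell that won at T - 1,
  # asking to be notified whenever that earlier cell wins again.
  def handle_cast({:subscribe, cell_at_t}, state) do
    {:noreply, %{state | listeners: [cell_at_t | state.listeners]}}
  end

  # When this cell wins a round, it tells its subscribers to move into a
  # potentially predictive state.
  def handle_cast(:won_round, state) do
    Enum.each(state.listeners, &GenServer.cast(&1, :enter_predictive_state))
    {:noreply, state}
  end

  def handle_cast(:enter_predictive_state, state) do
    # Real code would mark this cell as predictive for the next time step.
    {:noreply, state}
  end
end
```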

So at the moment, the basic HTM mechanics are all implemented. Next efforts will be on improving I/O for the API so that it sends back more meaningful, fleshed-out data. I'm thinking of storing the current-state SDR using a bitmap canvas (similar to the "canvas_paint" demo you get after typing ":wx.demo" into iex). That bitmap could then be returned when someone requests the "/pool/state/" endpoint... basically, trying to overcome the "it works great, you just can't see it" problem that backends face.

2 Likes

Update: I have been working on getting a JSON API in place, so soon there will be no more need for raw URI calls.

Also on the list of things to add in the next 2 weeks:

  1. Completion Callbacks
    ** When a pool finishes processing an input, send the output SDR to a selected API endpoint (for collection/concatenation)

  2. Pool groups
    ** When creating a pool, assign it a group_id, and choose whether or not to make it aware of other pools that share the same group ID. Enabling awareness would allow the sharing of connections between winning columns of disparate pools. The intended use is for when one pool is processing FFT information (noise, visual) while another is processing sensor information (heat, IMU, etc.) in the same timestep, so that cross-sensor data patterns can be discovered.

  3. More efficient datatypes.
    ** At the moment, connection maps are being stored as lists; will instead move towards using bitmaps, so that more processes (the basic unit of execution) can fit into memory (see the sketch after this list).

  4. Easy (relatively) cross network (LAN) connections.
    ** I have initially tested sharing work between networked nodes, but this isn't as easy as it could be yet. Looking to make it easier without requiring any extra dependencies.
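
On point 3, a rough illustration of the intended change (illustrative values only): a connection map stored as an integer bitmask instead of a list, with the overlap computed via a bitwise AND.

```elixir
import Bitwise

# List representation: one element per potential synapse.
connections_list = [1, 0, 1, 1, 0, 0, 1, 0]
sdr_list         = [1, 1, 0, 1, 0, 0, 1, 0]

overlap_from_lists =
  connections_list
  |> Enum.zip(sdr_list)
  |> Enum.count(fn {c, b} -> c == 1 and b == 1 end)

# Bitmask representation: the same data packed into a single integer.
connections_bits = 0b10110010
sdr_bits         = 0b11010010

# Overlap is just the popcount of the AND of the two masks.
overlap_from_bits =
  (connections_bits &&& sdr_bits)
  |> Integer.digits(2)
  |> Enum.sum()

IO.inspect({overlap_from_lists, overlap_from_bits})  # => {3, 3}
```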

I'm looking for more input on features that people want included when initializing a pool. Let me know, let's make a list, and I'll add them in.

3 Likes

Update:

Made a few minor structural changes in the code over the last few days, which will have large implications for the creation of pool groups.

Rather than have all the winner tracking for the columns stored in the PoolManager, I broke it off into an independent process. This way, each pool in a group can report its winners to a single WinnersTracker, so that during the process of selecting winning cells in the minicolumn, a pool can decide whether or not its columns subscribe to the winners of other pools in the group.

The practical effect of the above is that we can model the creation of different pools in a group, where some pools have columns that are temporally aware of winners in other pools (which may be processing some other type of input in the same time step), while other pools remain isolated islands within the group. Through this, we can begin to model different layers within columns/minicolumns, where some share connections, but not all.
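
A hypothetical sketch of what such a shared tracker could look like (names and message shapes are assumptions, not the actual repo code):

```elixir
defmodule HTM.WinnersTracker.Sketch do
  # Hypothetical sketch: one tracker shared by all pools in a group.
  use GenServer

  def start_link(group_id) do
    GenServer.start_link(__MODULE__, group_id, name: __MODULE__)
  end

  def init(group_id), do: {:ok, %{group_id: group_id, winners: %{}}}

  # Each pool reports its winning column:cell combos for the current step.
  def handle_cast({:report_winners, pool_id, winners}, state) do
    {:noreply, %{state | winners: Map.put(state.winners, pool_id, winners)}}
  end

  # A subscribed pool asks for the winners of every other pool in the group
  # when selecting its own winning cells.
  def handle_call({:winners_except, pool_id}, _from, state) do
    {:reply, Map.drop(state.winners, [pool_id]), state}
  end
end
```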

The logical hierarchy then looks something like:

IO encoder (from Python, C, or other external source) ==>
API handler (server) ==>
WinnersTracker ==>
PoolManager (can have multiple of these in a group) ==>
Columns (for each pool).

Along the way, fixed a few bugs with state failing to update as expected during winner selection.

All in all, code is now in a state which will allow me to finish making the input/output more friendly, and the overall use more approachable for experimentation by others.

Also still experimenting with distributing pools/columns across nodes on a network. If anyone here has a Pi cluster (of any variety of Pi), I'd love to have some testers for processing distribution and performance.

1 Like

Hi all (anyone who happens to read this)!

I've been continuing to make progress on this, to the point where running it in distributed operation has been letting me find and quash various bugs (such as an interesting problem where the columns would simply stop learning after 25 turns :hot_face:!)

I'm announcing my intention to put this out there as a platform at the end of June, so that people can freely sign up (this would definitely start out as something not requiring payment), get a key, then send in their graph of pool configuration (JSON format), and have HTM as a service: shoot your data in, get SDRs (spatial and predictive temporal) and an anomaly score out. It will continue to be open source, but my main intent is to provide an easy on-ramp into HTM... too frequently on the forum here, people want to use and apply HTM but don't have a background in programming.

While it will always be beneficial to know programming in order to advance HTM design and research, I believe the programming barrier is largely artificial, and that it is preventing us, as a community, from on-boarding people who may have new insights and ideas to contribute, or who might point out use cases that we, in our sincere naivete, simply may not have known about.

Assuming that the community does indeed find this useful (for me, the monthly cost to run this, for servers anyway, would be ~$50 for a cluster of 10 servers, though that wouldn't include any of my time that will undoubtedly go into supporting it), I'll probably transition it into a proper entity, set up proper agreements with Numenta, and expand it as an inexpensive, easy way to start using HTM in different organizations, or even just for people who are too lazy or don't want to bother with setting it up at home. At the same time, keeping everything open source will continue to give those who want to run it themselves that option.

Anyway, I just want to put this out here so that folks understand that I'm serious about making this work easily for all of us. Biological approaches will be the future, and HTM will undoubtedly be a part of that in some way.

2 Likes

Despite everything going on in the world, I've been fortunate enough to keep well and have been constantly busy with work. But going into the next couple of months, my other obligations are going to be slowing down, and I'm moving forward with this.

The main issue with an Elixir-based approach is that processing speed leaves something to be desired. To that end, what I'm working on at the moment, at the behest and urging of @jacobeverist, is creating native interfaces (NIFs) between the Elixir code (which would manage the distribution and IPC) and native C/C++ code. I have an early iteration which is currently unintelligible to anyone but myself, so, aiming at BrainBlocks as a starting point, I'll be pushing the cleaned-up code in the next several days. I'm trying to keep its requirements as minimal as possible, so that it's easily testable on Ubuntu 18+, or perhaps Windows with MinGW.

The longer-term goal would be a composable system where the heavy processing is done in native code, with the distribution, coordination, and communication handled by Elixir... possibly with Elixir also keeping a copy of the overall system state for high-level introspection.

The main advantage of this is that the BEAM automatically distributes tasks and handles the IPC, removing that burden from the C/C++ code (anyone who's tried to write anything more than trivial multi-threaded code probably has a bald spot that correlates with the number of threads managed in their C/C++ code).
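
The Elixir side of such a NIF could be sketched like this (hypothetical module and library names; just the usual Erlang NIF pattern, not the actual ElixirHTM code):

```elixir
defmodule HTM.NativeSP do
  # Hypothetical NIF wrapper: the BEAM handles distribution and messaging,
  # while the heavy overlap computation is delegated to compiled C/C++ code.
  @on_load :load_nif

  def load_nif do
    # Loads priv/native_sp.so (or .dll), assumed to be built separately.
    :erlang.load_nif(~c"priv/native_sp", 0)
  end

  # Stub that is replaced by the native implementation once the NIF loads.
  def compute_overlaps(_sdr_bits, _connection_maps) do
    :erlang.nif_error(:nif_not_loaded)
  end
end
```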

I did also go off into the weeds for a bit, messing about with D and its actor paradigm. It does bring some ideas that might be better implemented in C++ in a couple of years. Furthermore, in my own personal work, I've been putting together an agent/functional-based approach for systems development in C++, which might work out well too.

Should be a fun next few weeks!

5 Likes

Have you looked into handling the processing via Nx and the UI via LiveBook?

-r

Hi Rich!

I was happy/excited to see Nx come out earlier this year. For my own implementation in Elixir, I'd be interested in trying to apply Nx to it, but I have simply lacked the time since starting my current position (working as VP of Engineering at Sonasoft).

I'd be happy to return to my implementation and see where it might be possible to include Nx for the mathematical operations, though I suspect it will require a reshaping of the current application. I'm also slightly intimidated by its heavy dependency on TensorFlow's JIT compiler, as my last experience with trying to use that in an ad-hoc manner probably cost me some of my hairline. :smiley:
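
If Nx were applied here, one natural spot would be the overlap computation, which collapses into a single matrix-vector product. A minimal sketch (assumes the nx dependency is available; not part of the current codebase):

```elixir
# Hypothetical sketch: computing all column overlaps at once with Nx.
# Each row of `connections` is one column's connection map (0/1),
# and `sdr` is the encoded input as a 0/1 vector.
connections =
  Nx.tensor([
    [1, 0, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 0, 0, 1]
  ])

sdr = Nx.tensor([1, 1, 0, 1, 0, 0, 1, 0])

# Overlap score per column = dot product of its connection row with the SDR.
overlaps = Nx.dot(connections, sdr)
IO.inspect(overlaps)  # => [3, 1]
```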

If you're inclined or interested, I'd never hesitate to consider any pull requests if you want to try your hand at it.

I've not taken a look at LiveBook, so I'll add that to my pile of things to check out.

1 Like