Animation of spatial pooler and temporal memory

I've just implemented an HTM region in JS. Although it's not perfect, I thought I'd share an animation that I think helps illustrate the learning process in the region. The input is a very simple repeating, alternating pattern.

The top row is the input space. The cells in the columns span vertically. The proximal connections are a generalized sum of segment connections. Red is active, and yellow is predictive.

From the start the columns are bursting. As the columns compete, different sets of columns burst until the dominant columns are established. Then, as segments form, the columns activate their depolarized cells. Through Hebbian-style learning the segment synapses are pruned away so that cells react to specific patterns.
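For anyone curious, one step of the process described above boils down to something like the sketch below. The object shapes (`column.cells`, `segment.synapses`, permanence values) are placeholders for illustration, not my actual code.

```js
// Sketch of one temporal-memory step: predicted columns activate only their
// depolarized cells, unpredicted columns burst, and Hebbian-style learning
// strengthens synapses from previously active cells while pruning the rest.
function tmStep(activeColumns, learningRate = 0.1) {
  const nextActive = [];
  for (const column of activeColumns) {
    const predicted = column.cells.filter(cell => cell.predictive);
    const winners = predicted.length > 0 ? predicted : column.cells; // burst if unpredicted
    for (const cell of winners) {
      cell.active = true;
      nextActive.push(cell);
    }
  }
  for (const cell of nextActive) {
    for (const segment of cell.segments) {
      for (const syn of segment.synapses) {
        syn.permanence += syn.presynaptic.wasActive ? learningRate : -learningRate;
      }
      segment.synapses = segment.synapses.filter(s => s.permanence > 0); // prune
    }
  }
  return nextActive;
}
```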

I’ve still got more to do. As you might notice the global inhibition for SP is not working well for one of the input cells. I’ll need to implement local column inhibition. I may update this as I progress.

If anyone else notices any problems based on what they see here I’d appreciate feedback.
(click to enlarge repeating gif)

14 Likes

@sebjwallace,

I have to say, this way of visualizing inputs is (for me) an immediately understandable presentation. I personally have been holding off on creating a visualization system because nothing had occurred to me for how to compress the visuals while still keeping alive the laminar flow, the connectivity steps, and the ability to picture the actual structure. I LOVE IT! This has definitely nudged me in what I perceive to be a positive direction! Thank you! …and congratulations on your experiment so far!

Ideas:

  1. Be able to specify selection states of individual columns or sub-groups of columns, showing the selection as highlighted members.
  2. Collect statistics in another panel for which cells and how many are active/depolarized etc.
  3. Scroll left/right to be able to show 2,048 columns (the default used by NuPIC).
  4. Ability to turn off connections which aren’t currently active.
  5. Show connections having the same colors as the indicated cell states (for a selectable subset such as all connected, all depolarized, etc.).

Great job! :)

2 Likes

Thanks for the very encouraging comments!

You’ve given me a few ideas that I’d like to implement into my visualizations. I think colorizing the associated connections is the first thing on the list.

I’ll keep this thread updated!

3 Likes

I had a bit more time to play around with it. The active columns are colorized dark grey (with red active cells) while the predicted columns are colorized light grey (with yellow predicted cells). The colored connections are the ones propagating action potentials from an active cell out to all of its synapses. Instead of visualizing the segments, I draw the connections directly to the cells.
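For reference, the connection coloring is just the source cell's state propagated along its outgoing connections, roughly like this (the names and the `draw` callback are made up for illustration):

```js
// Sketch: color each connection by the state of its source cell, so active
// cells visibly push their color out along all of their synapses.
function colorConnections(connections, draw) {
  for (const conn of connections) {
    const src = conn.sourceCell;
    const color = src.active ? 'red'
                : src.predictive ? 'yellow'
                : 'darkgrey'; // inactive connections stay dark
    draw(conn.from, conn.to, color);
  }
}
```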

Although the sequence is simple, establishing the dominant columns seems to take quite a number of steps. This is the sort of interesting observation that can come from visualization.

(click to zoom into repeated gif)

5 Likes

@sebjwallace,

Nicely done. The colors give a more immediate sense of what’s going on. I like it!

Very cool, @sebjwallace! I love visualizations like this. You’ve found a great way to get across the state of the system in a simple fashion. Kudos!

Is your code open source? Also, since this visualization is in Javascript, I wonder how hard it would be to incorporate it into my visualization project (which connects to a server running NuPIC)?

1 Like

I'm very delighted you like it, @rhyolight!

It's not open source; however, I'd like to fork the visualization project and see what I can do with the NuPIC implementation.

I've done a few updates as well. In this example the input sequence moves to the right over 11 steps. This movement to the right helps illustrate the temporal transition from state to state. The input sequence repeats a few times.

The white lines represent the new connections made from the previously active cells to the currently active cells. This is when the segments are established. The dark connections are the inactive connections. The active columns are white, the predicted columns are yellow, and the faint white columns are the previously active ones. In this example there is no learning happening on the proximal dendrites, so distal learning is instant because SP is switched off.
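The white lines come from a step roughly like this sketch, where new distal connections are grown from the previously active cells to the currently active cells (placeholder names, not the real code):

```js
// Sketch: grow a distal segment on each currently active cell whose synapses
// connect back to the previously active cells. These new connections are the
// white lines in the animation.
function growDistalConnections(prevActiveCells, currActiveCells, initialPermanence = 0.3) {
  const newConnections = [];
  for (const target of currActiveCells) {
    const segment = { synapses: [] };
    for (const source of prevActiveCells) {
      segment.synapses.push({ source, permanence: initialPermanence });
      newConnections.push({ from: source, to: target });
    }
    target.segments.push(segment);
  }
  return newConnections; // rendered as white lines for this step
}
```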

In this example SP is switched on, so learning takes a bit longer. But it does show some other features. The blue columns are the falsely predicted columns, containing the blue falsely predicted cells. In this longer example there are more white connections as the active columns change due to competition. By the last sequence the region has completely learned the pattern.
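For the coloring, a blue column is simply one that was predicted in the previous step but received no feedforward input in the current step; roughly this check (names are illustrative):

```js
// Sketch: classify a column for rendering. A column that was predicted last
// step but is not active this step is a false prediction (drawn blue).
function classifyColumn(column, isActiveNow) {
  const wasPredicted = column.cells.some(cell => cell.wasPredictive);
  if (isActiveNow && wasPredicted) return 'predicted-active'; // prediction fulfilled
  if (isActiveNow) return 'active';                           // white column
  if (wasPredicted) return 'false-predicted';                 // blue column
  return 'inactive';
}
```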

5 Likes

I’ve gone off on an experimental tangent by trying to implement a crude version of a Temporal Pooler. The lowest level of the TP has a single representation for the whole repeating sequence. The levels above that are subsequence representations.

The mechanism used to implement the TP is very similar to that of TM. With TM, distal connections depolarize cells and then proximal connections activate them. With TP, distal connections from the cells of the columns depolarize the cells, and then another distal connection (from another segment/pattern) activates them. So if two predicted patterns correctly activate in sequence, the TP cells are depolarized and then activated over the two steps. This algorithm is repeated for the next TP layer/row.

Each pattern within the TM section (all active cells across all columns) is encoded as a distribution of 4 cells on the TP layer (4 cells is an arbitrary value). As in TM, that means no single cell represents any single pattern; rather, a distribution of cells represents a single pattern.
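As a rough sketch of the mechanism (with illustrative names and an arbitrary match threshold): each TP cell carries distal segments, each segment recognizes one TM pattern, a first matching segment depolarizes the cell, and a second match on an already-depolarized cell activates it.

```js
// Sketch of the crude TP step: a segment that recognizes the current TM
// pattern depolarizes its cell; a later recognition on a depolarized cell
// activates it. The 0.5 match threshold is arbitrary.
function tpStep(tpCells, activeTmCellIds, matchThreshold = 0.5) {
  const active = new Set(activeTmCellIds);
  for (const cell of tpCells) {
    const matched = cell.segments.some(seg => {
      const hits = seg.tmCellIds.filter(id => active.has(id)).length;
      return hits / seg.tmCellIds.length >= matchThreshold;
    });
    if (!matched) continue;
    if (cell.depolarized) cell.active = true; // second recognized pattern in sequence
    else cell.depolarized = true;             // first recognized pattern
  }
}
```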

Again, this is a crude experiment. There's probably a better way to implement it.

(enlarge gif)

5 Likes

My experimental tangent continues. The temporal pooler recognizes sequences in temporal memory and acts almost like a recorder for consistent sequences. If any of the sparse patterns in TM that are part of a sequence is recognized, then a sparse pattern in TP is activated. In this case the TP sparse pattern is 4 cells, which is an arbitrary value.

As seen in the gif, for each active sparse pattern among the active columns there is a segment that recognizes it. The segment depolarizes the associated cell in TP, and then any segments activated after that continue to depolarize it into hyperpolarization (activation).

There will be cases where similar sequences share similar TM sparse patterns. Instead of taking a union of TP patterns, the TP cells compete for activation. If there were a set of cells in TP that responded to A,B,C,D and another set that responded to B,C,D,E, then feeding in the sequence A,B,C would let A,B,C,D win over B,C,D,E, because it recognizes one more pattern in the sequence (A) even though both recognize B,C. The competition is somewhat similar to SP.
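That competition can be scored very simply: count how many of the observed patterns each candidate TP cell set recognizes and let the highest count win, much like SP's overlap-based inhibition. A sketch with placeholder names:

```js
// Sketch: TP cell sets compete by how many of the observed TM patterns they
// recognize; the best-scoring set wins, similar to SP inhibition.
function competeTpSets(tpSets, observedPatternKeys) {
  const observed = new Set(observedPatternKeys);
  let winner = null;
  let bestScore = -1;
  for (const set of tpSets) {
    const score = set.recognizedPatternKeys.filter(k => observed.has(k)).length;
    if (score > bestScore) { bestScore = score; winner = set; }
  }
  return winner; // e.g. {A,B,C,D} beats {B,C,D,E} on the input A,B,C
}
```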

The stable representation from the TP could be combined and fed into another region, and SP would treat it just like any other pattern. Essentially the TP has collapsed a temporal pattern into a spatial pattern.

In this example there are two separate input sequences. For the two sequences, two separate sets of sparse representations form within the TP.

Again, this is a crude experiment. I have no idea if anything like this is happening biologically.

(click to enlarge gif)

3 Likes

This visual shows the prediction of the complete sequence, inferred from the TP. When a sequence is recognized by a distribution of cells in the TP, all the cells they connect to are the cells that make up the whole sequence. These cells are indicated in green, and the columns they belong to are in a lighter green. As seen in the visual, the sequence is predicted and the actual activation fulfills it. This prediction→activation/deactivation could then lend itself to Hebbian reinforcement, as in SP.
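Concretely, the green prediction is just the union of every TM cell that the active TP cells connect to, along the lines of this sketch (illustrative names):

```js
// Sketch: infer the whole predicted sequence from the active TP cells by
// collecting every TM cell their segments connect to (the green cells).
function predictSequenceFromTp(activeTpCells) {
  const predictedTmCells = new Set();
  for (const tpCell of activeTpCells) {
    for (const segment of tpCell.segments) {
      for (const tmCellId of segment.tmCellIds) {
        predictedTmCells.add(tmCellId);
      }
    }
  }
  return predictedTmCells; // union covering every transition in the sequence
}
```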

Please, I’d love to hear any ideas from anyone.

(click to zoom gif)

4 Likes

@sebjwallace
That looks awesome! The predicted sequence looks like a union of all the transitions, which makes sense given the pooling mechanism. But what if the sequence is longer? The prediction will contain more transitions and become very dense, losing the property of sparseness.

Do you think it’s possible that the predicted sequence can be split into the original transitions one by one? I think that at least biologically this should be possible because HTM theory mentions sequence recall and replay, but I don’t understand how the sequences would actually be replayed.

Also, recall from the SDR paper that subsampling would not significantly affect the potency of the pooled SDR, so it may in fact be possible to limit sparsity to a desired percentage by simply subsampling or removing arbitrary bits? Someone correct me if I'm wrong.

@dorinclisu I don't believe density from larger sequences would be a problem. As @cogmission mentions above, "subsampling would not significantly affect the potency of the pooled SDR."

Sparsity can be derived by simply subsampling the cells in the TM space of the sequence union. That’s the beauty of the large capacity property.
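For example, a fixed sparsity could be enforced by randomly subsampling the union down to a target percentage, roughly like this (the 2% target is just an illustration):

```js
// Sketch: subsample a pooled union of TM cell ids down to a target sparsity.
function subsampleUnion(unionCellIds, totalCells, targetSparsity = 0.02) {
  const keep = Math.min(unionCellIds.length, Math.round(totalCells * targetSparsity));
  const ids = [...unionCellIds];
  for (let i = ids.length - 1; i > 0; i--) { // Fisher-Yates shuffle
    const j = Math.floor(Math.random() * (i + 1));
    [ids[i], ids[j]] = [ids[j], ids[i]];
  }
  return ids.slice(0, keep);
}
```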

An idea I had about TP sparsity is that the active cells in the TP (the layer below) are the SDR. For each sequence it recognizes, there are only ever 4 cells active at a time (4 is an arbitrary choice). This 4-cell SDR is the spatial representation of the temporal sequence. So instead of using the union of the cells in the TM sequence (with or without subsampling), the SDR in the TP can be used instead. Again, I have no idea if that actually reflects the biology.

I'm not sure if they can be split, but they can be composed hierarchically as regions connect together. A TP SDR could be fed into another region's SP, and higher-level TP abstractions would be produced. As a guess, I think recall and replay is as simple as automatically activating predicted cells without proximal input. When you close your eyes and imagine something, there is no visual input, yet you are able to 'see' something. You are replaying sequences within TM without feedforward activity from the proximal dendrites. I guess the thalamus has a lot to do with this (attention), but for practical purposes we could just automatically activate any predicted cells, then activate their predictions, and so on, and that would play out a sequence based upon the initial conditions, sort of like a cellular automaton.
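That replay loop could be as simple as this sketch, where predicted cells are promoted to active with a k-winners-take-all selection and no proximal input at all (the function and property names are placeholders):

```js
// Sketch of replay by spontaneous activation: repeatedly promote the k most
// strongly predicted cells to active, without any proximal (feedforward) input.
function replay(startActiveCells, predictFn, k, steps) {
  const played = [startActiveCells];
  let active = startActiveCells;
  for (let i = 0; i < steps; i++) {
    const predicted = predictFn(active); // cells depolarized by `active`
    if (predicted.length === 0) break;
    active = predicted
      .sort((a, b) => b.distalExcitation - a.distalExcitation) // k-winners-take-all
      .slice(0, k);
    active.forEach(cell => { cell.active = true; });
    played.push(active);
  }
  return played; // the unfolded sequence
}
```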

I believe @cogmission is asking about subsampling the TM cells rather than the sparsity of the TP representation. Would you lose much if you just sampled a fixed sparsity of inputs from that union? Like bits and pieces from every TM representation.

There is a blog post by @floybix with extensive experiments on recall. His work shows in detail that it is possible and how it works.
http://viewer.gorilla-repl.org/view.html?source=gist&id=95da4401dc7293e02df3&filename=seq-replay.clj

Oh, and @sebjwallace, thank you for sharing a wonderful visualization. Just one nitpick: hyperpolarization is the exact opposite of depolarization. An active cell can be inhibited if it reaches an inhibitory spike amount; just below that threshold, the cell is hyperpolarized. It surprised me as well when I found out: why would they call that hyperpolarization? :)

2 Likes

Ah, my bad. I’ve come to associate hyperpolarization with activation as hyperpolarization occurs directly after the cell has depolarized enough to spike.

That's what I thought too. So if you were to subsample, say, 40% of the light-green cells in the predicted TM sequence, then that would be fine. You could subsample a lower percentage if the number of cells in each column were increased. There are currently 32 cells in each column, but with 128 we could reach very good sparsity even with large sequences.
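To put rough numbers on that (assuming 2,048 columns, 2% active columns, i.e. about 40 columns per step, one cell per active column, and a 10-step sequence; these figures are illustrative, not from the demo):

```js
// Rough upper bound on the sparsity of a sequence union, ignoring overlap
// between the patterns in the sequence.
function unionSparsity(cellsPerColumn, columns = 2048, activeColumns = 40, steps = 10) {
  const totalCells = columns * cellsPerColumn;
  const unionCells = activeColumns * steps;
  return unionCells / totalCells;
}

console.log(unionSparsity(32));  // ~0.0061 -> about 0.6% of all cells
console.log(unionSparsity(128)); // ~0.0015 -> about 0.15% of all cells
```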

Thanks for sharing that link about recall.

2 Likes

From @floybix's link:

A more natural and hopefully flexible approach is to activate predicted cells directly. I call this spontaneous activation: cells becoming active from distal/apical excitation alone, without proximal excitation. It is still the usual k-winners-take-all competitive selection. Once we select some cells to become active, they generate more lateral distal excitation (predictions), and so on; we iterate.

@sunguralikaan I've been reading from the link you sent and, if I read it properly, it seems we have similar ideas about feedback replay. It supports an idea I had today. If the TP cells feed back to the TM cells and set them into a depolarized/predicted state, then a single feedforward input (the beginning of the sequence) will be enough to automatically cause a 'chain reaction' of activation through the whole sequence. More specifically, if a cell is depolarized from feedback and another spiking cell (from the previous step) further depolarizes it, then the cell will spike and further depolarize its target cells that are already depolarized from feedback, and so on.

I might try and implement this to show you what I mean. It might then be interesting to hook it up between two regions and watch how feedback from the higher region can unfold sequences from a higher-level ‘concept’.
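Roughly what I have in mind, with placeholder names: the TP feedback pre-depolarizes the TM cells of the sequence, and a single feedforward activation then cascades only through those primed cells.

```js
// Sketch: feedback from TP primes (depolarizes) the TM cells of a sequence;
// a single feedforward input then triggers a chain reaction through them.
function primedReplay(sequenceTmCells, firstInputCells, nextCellsFn, maxSteps = 50) {
  const primed = new Set(sequenceTmCells); // depolarized via TP feedback
  let active = firstInputCells;            // the beginning of the sequence
  const unfolded = [active];
  for (let i = 0; i < maxSteps; i++) {
    // A cell spikes only if it is primed by feedback AND depolarized further
    // by the cells that spiked in the previous step.
    const next = nextCellsFn(active).filter(cell => primed.has(cell));
    if (next.length === 0) break;
    unfolded.push(next);
    active = next;
  }
  return unfolded;
}
```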

I know about subsampling, but I was thinking about the extreme case of, say, a sequence of 50 orthogonal SDRs, each 2% sparse, saturating all the cells in TM; in that case no subsampling can do anything useful. I suppose such saturation is extremely unlikely, but perhaps there is also an additional mechanism at work, like some sort of "online" inhibition, where older SDRs in the sequence fade away and are not "remembered" by the pooler?
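Purely as a speculative sketch of what I mean, each cell in the pooled union could carry a trace that decays every step, so the oldest SDRs in a long sequence drop out instead of saturating the pool (all names and constants here are made up):

```js
// Speculative sketch: a pooled union with decaying membership, so older SDRs
// in a long sequence fade out of the pool rather than saturating it.
function updatePooledUnion(pool, activeCellIds, decay = 0.9, dropBelow = 0.1) {
  for (const [id, trace] of pool) {
    const next = trace * decay;
    if (next < dropBelow) pool.delete(id); // oldest contributions fade away
    else pool.set(id, next);
  }
  for (const id of activeCellIds) pool.set(id, 1.0); // refresh the current SDR
  return pool;
}

// Usage: const pool = new Map(); updatePooledUnion(pool, [3, 17, 42]);
```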

1 Like

You’re probably right about that. I could only guess myself. Hopefully we’ll get some more clues from biology soon.

That experiment does this, if I am not mistaken. Also, what you are describing actually sounds like the union pooling theory of HTM.

This is done through apical dendrites between layers in that theory.

1 Like

This is a very cool visualization - nicely done! It also clearly shows the transitions and the process of going from bursting to sparse predicted cells.

4 Likes