Boosting in HTM and new ways

I don’t know :slight_smile: This is what I’m doing.

The formula is not the same. In NuPIC, the duty cycles are computed using a moving average (not an exponential moving average). NuPIC also does `_bumpUpWeakColumns`.

Exponential moving average might be a better choice.
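
For concreteness, here is a minimal sketch contrasting the two styles of duty-cycle update being discussed; the function names and parameters are illustrative, not NuPIC's actual API:

```python
import numpy as np
from collections import deque

def make_window_duty_cycle(period):
    """Sliding-window duty cycle: the fraction of the last `period`
    steps in which each column was active."""
    history = deque(maxlen=period)
    def update(active):
        history.append(np.asarray(active, dtype=float))
        return np.mean(history, axis=0)
    return update

def make_ema_duty_cycle(n_columns, alpha=0.001):
    """Exponential moving average: old activity decays geometrically,
    so recent steps weigh more and no history buffer is needed."""
    duty = np.zeros(n_columns)
    def update(active):
        nonlocal duty
        duty = (1.0 - alpha) * duty + alpha * np.asarray(active, dtype=float)
        return duty
    return update

# Toy usage: 5 columns, column 2 active on every step.
window_update = make_window_duty_cycle(period=1000)
ema_update = make_ema_duty_cycle(5, alpha=1.0 / 1000)
for _ in range(10):
    active = [0, 0, 1, 0, 0]
    w, e = window_update(active), ema_update(active)
```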

`bumpUpWeakColumns` is an additional mechanism on top of boosting. It reinforces (increments) all the proximal synapses of columns whose average input is lower than the others’. Personally I found this to be inadequate, so in my implementation I also bump all the synapses by the boosting factor (rather than by the average input).
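
A rough sketch of the difference: `bump_up_weak_columns` mirrors the spirit of NuPIC's `_bumpUpWeakColumns`, while `bump_by_boost_factor` is one reading of the variant described above (an assumption, not NuPIC code); all names and values are illustrative:

```python
import numpy as np

def bump_up_weak_columns(perms, overlap_duty_cycles, min_duty_cycles, bump=0.01):
    """NuPIC-style: columns whose overlap duty cycle is below their minimum
    get all of their potential synapse permanences incremented."""
    weak = overlap_duty_cycles < min_duty_cycles
    perms[weak] += bump
    np.clip(perms, 0.0, 1.0, out=perms)
    return perms

def bump_by_boost_factor(perms, boost_factors, bump=0.01):
    """Variant described above (assumption): every column's potential
    synapses are bumped in proportion to its boost factor, so chronically
    under-active columns (large boost) recover the fastest."""
    perms += bump * boost_factors[:, None]
    np.clip(perms, 0.0, 1.0, out=perms)
    return perms
```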

No, I’m talking about spreading the meaning out over more minicolumns in the SP, not the inclusion of more semantic info. The efficiency means there is less overloading of meaning in each of the minicolumns that are used, because more minicolumns are used. The storage of meaning stays the same; it is just spread out.

I’m not sure what you mean. Boosting does not introduce instability.

Oh I see. In what way is this overloading bad?

I meant that, for repetitive sequences, this will lead to the same input being represented differently.


To clarify @abshej’s point here a little more… essentially, when using boosting in a system which wouldn’t normally utilize all of the minicolumns, you end up with a larger/denser pool of minicolumns representing a particular input, where only a subset of that larger representation is active at a given time. This means there is a greater chance that the distal connections a TM cell made while one of those representations was active will be below the threshold when a different one is active, which potentially means more bursting and more cycles to stabilize on a particular input + context representation.

Picking the right boosting factor is essentially a balancing act between overall capacity and cycles required to form stable representations in TM.
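
A toy illustration of that effect, with invented numbers: a TM cell grows distal synapses to a sample of the minicolumns representing an input, and if boosting later swaps in a different subset of the enlarged pool, the overlap with those synapses can fall below the activation threshold:

```python
import numpy as np

rng = np.random.default_rng(0)
pool = np.arange(200)   # enlarged pool of minicolumns that can represent this input
threshold = 13          # distal segment activation threshold (NuPIC-style default)

# First presentation: 40 pooled minicolumns are active; a TM cell grows
# distal synapses to 20 of them.
first = rng.choice(pool, size=40, replace=False)
synapses = rng.choice(first, size=20, replace=False)

# Later presentation: boosting has shifted which 40 minicolumns win.
later = rng.choice(pool, size=40, replace=False)

overlap = np.intersect1d(synapses, later).size
print(f"overlap = {overlap}, segment active: {overlap >= threshold}")
# With 40 of 200 pooled minicolumns active at a time, the expected overlap
# is only 20 * (40 / 200) = 4 synapses, far below the threshold, so the
# cell fails to become predictive and its minicolumn bursts.
```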


Are you saying this is the instability? Please elaborate.
This would end up producing different SDRs for the same patterns in some cases, so those patterns would have to be learned again.
I thought that, since boosting is gradual, even if the representation of an input changes over time it would only change slightly. But that cannot be guaranteed, since a repetition of the same input can occur at any time, after the winning columns have already been influenced considerably by the boost factors. And how exactly does this help in recognizing previous instances more precisely, as shown in the video?
I am talking about instability in SP outputs.

Instability in SP outputs of course leads to instability in TM outputs, so I think we are talking about the same issue. Since TM is the ultimate goal, that is why I clarified that facet of the issue.

Anyway, if you keep the boosting factor low enough, that reduces instability in TM, as long as enough of the distal synapses remain active while the minicolumns for a previous input change, so that TM learning adapts to cover more of the larger representation.
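
For reference, here is a minimal sketch of the boost-factor rule in current NuPIC-style implementations, to my understanding: the factor is an exponential function of how far a column's active duty cycle sits from the target density, so a small boost strength keeps all factors near 1, which is the "low enough" regime described above.

```python
import numpy as np

def boost_factors(active_duty_cycles, target_density, boost_strength):
    """Global boosting rule, in the spirit of NuPIC:
    factor = exp((targetDensity - activeDutyCycle) * boostStrength).
    Under-active columns get factors > 1, over-active ones factors < 1,
    and boost_strength = 0 disables boosting (all factors exactly 1)."""
    return np.exp((target_density - active_duty_cycles) * boost_strength)

duty = np.array([0.00, 0.02, 0.10])  # a silent, a typical, an over-active column
print(boost_factors(duty, target_density=0.02, boost_strength=1.0))
# -> [1.02  1.    0.92]   gentle nudges
print(boost_factors(duty, target_density=0.02, boost_strength=10.0))
# -> [1.22  1.    0.45]   much stronger redistribution
```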

Agreed. But the instability in TM’s distal connections isn’t my focus. Wouldn’t disabling boosting be a better option, since we cannot predict whether enough distal synapses will be active? That depends on when the input repeats and when new connections are formed, along with the activation of other minicolumns (during previous and later inputs), which influences the boosting factors.

And how does the overall mechanism help in recognizing previous occurrences of an input more precisely, if no new (different) semantic information is added, only a selection of other columns due to bursting?


Sure, disabling boosting is a perfectly valid option for some scenarios. What boosting brings to the table is greater capacity for semantically similar contexts in TM. Without boosting, a smaller set of minicolumns is used to represent all contexts of a particular input. With boosting, there is a larger set of minicolumns, and thus capacity to learn more contexts of a particular input. That greater capacity comes at the cost of greater instability.


I would add that boosting can also result in a finer columnar representation. Here is a scenario I encounter from time to time:

A minicolumn can only update its proximal synapses when it is active. Imagine a scenario where minicolumn A is active on both input patterns X and Y because of its existing potential connections to both; this happens very frequently due to the initial synapse configuration. Depending on the increment (say 300) and decrement (say 100) parameters, this minicolumn may keep getting activated by both of these inputs, and with these parameters there is no way to make sure the column activates on only one of the patterns.

If boosting is enabled, the overlap of this column decreases, giving another column B a chance to become active. This newer column now learns patterns X and Y as the boosting allows. A and B both reinforce their connections to X and Y, so B now competes with A for activation on X and Y. I regularly observe that this competition results in each column specializing on one of the inputs, because of the columns having different potential synapse pools and the synaptic update rules.

If you set the parameters so that decrement >= increment (for example 100 and 100), you can make sure a column always represents a single input. However, you then lose spatial generalization capacity, because no column will overlap the representations of two different input patterns.

TL;DR: Previously you had column A representing both input patterns X and Y. With boosting on, A represents X and B represents Y.
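
Here is a minimal simulation of that competition (all parameters invented, and the update rule is a simplification of SP proximal learning): columns A and B both overlap the patterns X and Y, the winner strengthens synapses to active bits and weakens the rest, and boosting modulates the competition. With these toy numbers the run typically ends with each column specializing on one pattern; with boosting disabled (`STRENGTH = 0`), A simply keeps winning both.

```python
import numpy as np

X = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=float)
Y = np.array([0, 0, 1, 1, 1, 1, 0, 0], dtype=float)  # overlaps X on bits 2, 3

perms = np.array([
    [0.30, 0.60, 0.60, 0.60, 0.60, 0.30, 0.30, 0.30],  # column A: overlaps X and Y
    [0.55, 0.30, 0.30, 0.30, 0.55, 0.55, 0.30, 0.30],  # column B: weaker, Y-leaning
])
CONNECTED, INC, DEC = 0.5, 0.03, 0.01     # permanence threshold, increment, decrement
TARGET, STRENGTH, ALPHA = 0.5, 2.0, 0.05  # duty-cycle target, boost strength, EMA rate
duty = np.zeros(2)

for t in range(400):
    pattern = X if t % 2 == 0 else Y
    boost = np.exp((TARGET - duty) * STRENGTH)
    overlaps = ((perms >= CONNECTED) * pattern).sum(axis=1)
    winner = int(np.argmax(boost * overlaps))
    # Only the winner updates its proximal synapses.
    perms[winner] = (perms[winner] + np.where(pattern > 0, INC, -DEC)).clip(0.0, 1.0)
    duty = (1 - ALPHA) * duty + ALPHA * np.eye(2)[winner]

for name, pattern in (("X", X), ("Y", Y)):
    overlaps = ((perms >= CONNECTED) * pattern).sum(axis=1)
    print(name, "is now won by column", "AB"[int(np.argmax(overlaps))])
```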



Is there a biological correlate to boosting?

That is what I would like to know as well, since homeostasis alone cannot explain boosting, but I haven’t read about it in detail.
At first I thought homeostasis was just the maintenance of overall sparsity and nothing more.

Also, can anyone explain why boosting would lead to the same semantic information being represented in a different distribution? With boosting, a minicolumn can be replaced as the winner by another one that does not share a spatially close receptive field, right?

refractory period

If X and Y appear close in time…

Why?

If X appears far earlier, I don’t see how it can affect Y’s representation. I think the boosting algorithm will be unaware of the similarity between X and Y (unless both appear within the time window used to compute the activation frequency of each minicolumn).

This is in response to “refractory period” and “If X and Y appear close in time.”
Once you bring in time, you have to look at the coupling of HTM theory to time (the T of HTM).
The entire thing about columns is change detection: x changes to y. Implicit in HTM theory is that the column enters a predictive state and then fires, triggering learning. In the HTM canon this period is set to one timestep. The refractory period also falls within one timestep in the HTM canon.

Elsewhere on this forum we discussed the decay of the predictive state over time. If there is a partial depolarisation that does not result in an action potential, there is a uniform decay that does not seem to be related to the refractory period; they are different mechanisms. In biology this is NOT restricted to a single timestep, but to the best of my knowledge it is not used in HTM theory in any way.

The spreading of activation (or overactivation) is a completely different thing - spatial and not temporal.

I think in the SP there is no predictive state: a minicolumn is active if it wins the inhibition, so I can’t see the relation with depolarization decay. In any case, the hyperpolarization state seems to last a substantial amount of time. My understanding is that a neuron can’t fire again from such a state, and depends only on its own membrane ion pumps to recover from it.

My intuition in this particular case is that one cell can’t fire again within the refractory period (whether a pyramidal cell or an interneuron). In some way, the inhibition process is not “state-less” (as the plain SP assumes). There is a short-term influence of the recent past, i.e. if you hammer the granular layer with similar inputs, the firing pattern will spread out.

Perhaps I’m wrong…

As long as column A is over-activated, there will be another column with an increased chance of activation, and as long as they compete with each other they will tend to share the activation among the input patterns. I am not sure X needs to appear close to Y in this case; column A will be over-activated even if only X occurs frequently enough.

Yes, I think you are right.

Nevertheless, my point was that boosting will not necessarily separate similar input values: it depends on when the values are initially exposed to the system.

In any case, static tie-breakers (or a stable sort, like the one used now) might artificially increase the activation frequency of some columns. I don’t know whether boosting could be a “practical” way to alleviate this issue…