Yes, that’s the plan. I’ve already hacked together a backend that stores the active duty cycles and overlap duty cycles. The boosting is all tied into inhibition, right? It seems simple enough with global inhibition, but local inhibition complicates everything. I’m considering explaining global inhibition and boosting in episode 9, then topology and local inhibition in episode 10 (episode 8 is about learning).
Just a quick suggestion @rhyolight, it may be better to include bumping along with boosting. Bumping seems to be more valuable in terms of creating the competition from what I observe. Usually, I turn off the boosting for more stable representations in the short term because of the things you mentioned and just work with bumping.
For the sake of a shared understanding of these terms:
Boosting: An artificial overlap increase that forces a less-used column to become active. This lets less-used columns survive inhibition, which helps them adapt their synapses to input patterns, because only activated columns are allowed to learn. The goal is for these columns to become active without the help of boosting in the future. The active duty cycle is the main metric for this.
Bumping: Increasing the permanence of all the synapses of less-used columns until the moving average of the overlap is above a threshold. This does not force an activation, but it helps more synapses become connected, which indirectly leads to the column becoming active on its own in the future because it receives more input. The overlap duty cycle is the main metric for this.
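To make the difference between the two definitions concrete, here is a minimal Python sketch. The function names, the plain-list data layout, and the `increment` value are assumptions for illustration only; this is not NuPIC's API.

```python
def boost_overlaps(overlaps, boost_factors):
    """Boosting: scale each column's overlap score before inhibition."""
    return [o * b for o, b in zip(overlaps, boost_factors)]


def bump_permanences(permanences, overlap_duty_cycles,
                     min_overlap_duty_cycle, increment=0.1):
    """Bumping: raise every synapse permanence of under-stimulated columns."""
    bumped = []
    for column_perms, duty in zip(permanences, overlap_duty_cycles):
        if duty < min_overlap_duty_cycle:
            # Permanences are capped at 1.0 (assumed range 0..1).
            column_perms = [min(1.0, p + increment) for p in column_perms]
        bumped.append(list(column_perms))
    return bumped


# Column 0 is rarely used, column 1 is healthy (made-up numbers).
boosted = boost_overlaps([4, 10], boost_factors=[3.0, 1.0])
bumped = bump_permanences([[0.25, 0.5], [0.25, 0.5]],
                          overlap_duty_cycles=[0.0, 0.5],
                          min_overlap_duty_cycle=0.1,
                          increment=0.25)
```

With these numbers, boosting lets column 0 win inhibition right now, while bumping only changes its permanences so that it may win on its own later.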
@sunguralikaan - non-spiking adaptation is plausible and has been discussed before. I’m not sure if there were any concrete objections other than it not making too big of an impact overall. The main negative impact is in anomaly detection so it would be interesting to see if non-spiking adaptation improved NAB results.
Clever ideas, but I need to stick with Numenta’s current implementation of HTM for these videos. That’s not to say other algorithms might not perform better in different situations, however.
I think @sunguralikaan was referring to the current implementation. Specifically,
Is this a normal part of boosting? Or is this some kind of alternative method? Sorry, I wasn’t planning on researching boosting thoroughly until next week!
One thing to keep in mind: If there aren’t enough unique inputs, boosting will wreak havoc.
An easy example: my “trivial sequence”: http://mrcslws.com/blocks/2016/03/13/column-overlaps-and-boosting.html
I have 25 unique inputs, 40 active columns, and 2048 columns. That means at most 25 * 40 = 1000 columns will be used. So at any given time, at least 1048 columns are getting starved, and boosting is going to teeter-totter the representations.
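The arithmetic above can be checked with a few lines of Python. The helper name `starved_columns` is hypothetical, and the upper bound assumes the best case where no two inputs share a column.

```python
def starved_columns(num_columns, active_per_input, unique_inputs):
    """Upper bound on columns ever used, and lower bound on starved ones."""
    max_used = min(num_columns, active_per_input * unique_inputs)
    return max_used, num_columns - max_used


max_used, starved = starved_columns(num_columns=2048,
                                    active_per_input=40,
                                    unique_inputs=25)
print(max_used, starved)
```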
I’m not sure if you’re running into this problem, but you might be.
@mrcslws - I thought there was specific logic to catch that case. So if columns are perfect matches then ignore other columns that were boosted or something like that.
@rhyolight The visualizations are very nice! I am also studying the effect of boosting during SP learning. You used the term “boosted period” (or when boosting occurs) in the video, I assume this is the period where the output of the SP is drastically different from previous time steps with the same input, right? Do you have boosting on throughout the video?
I think this oscillatory behavior is due to how boostFactors are updated in SP. boostFactors are initialized to 1 for all the columns. A column’s boostFactor keeps increasing while the column is not active, and it is reset to 1 once the column becomes active. I can see how this scheme could lead to oscillations in the SP output. If you monitor the boostFactors over training, will you see an increase in the average boostFactors during the boosted period?
Also, you mentioned that after the boosted period, the SP output becomes reliable again. This is expected because boosting does not necessarily lead to much change in synaptic permanences, especially if you have small increment/decrement values. If a column becomes active due to boosting, it is likely to be active for only a very short time period, because the boostFactor will quickly get reset. I don’t think it can learn much during this short window.
Anyway, I am still trying to understand boosting. I understand the logic of encouraging a distributed representation through boosting, but I am not sure whether it actually does the job.
Yes, you’re right. And I’m not sure what you mean by “have boosting on”. The only thing I did differently was set
2.0. I watched it play out, and I didn’t see a disruption like that again. So I assume boosting only occurred once. Honestly, I never understand this stuff until I do a visualization. Hopefully I can come up with one to see boosting live in action even better.
@sunguralikaan Sorry about the ignorant post above. I haven’t studied boosting much yet, and now that I’ve re-read your post, it’s very helpful. Thanks for that!
I’m now also learning about boosting in SP, and I’ve read this in the code:
     *         boostFactor
     *             ^
     * maxBoost _  |
     *             |\
     *             | \
     *       1  -  |  \ _ _ _ _ _ _ _
     *             |
     *             +--------------------> activeDutyCycle
     *                |
     *         minActiveDutyCycle
Is it a representation of how the boost factor of one column changes over iterations? And how can we set the “minActiveDutyCycle” parameter? Besides, the boost factor for each column does not decrease or increase linearly as described; it only takes two values, 1 or maxBoost. I have a lot of confusion about this.
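Read at face value, the chart in that comment looks like a piecewise-linear function of activeDutyCycle rather than a trace over iterations. The sketch below is one possible reading under that assumption; `linear_boost_factor` is a hypothetical helper, not the code the comment was taken from.

```python
def linear_boost_factor(active_duty_cycle, min_active_duty_cycle, max_boost):
    """Piecewise-linear reading of the chart.

    Falls linearly from max_boost (duty cycle 0) to 1 (duty cycle equal to
    min_active_duty_cycle), then stays at 1 for anything above that.
    """
    if min_active_duty_cycle <= 0 or active_duty_cycle >= min_active_duty_cycle:
        return 1.0
    slope = (1.0 - max_boost) / min_active_duty_cycle
    return max_boost + slope * active_duty_cycle


never_active = linear_boost_factor(0.0, 0.02, 10.0)
halfway = linear_boost_factor(0.01, 0.02, 10.0)
healthy = linear_boost_factor(0.05, 0.02, 10.0)
```

Under this reading, seeing only 1 or maxBoost would happen when a column's activeDutyCycle is either exactly 0 or already above minActiveDutyCycle.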
Hi @nluu. Yes, that sounds right.
For global (simple) inhibition: During each SP compute iteration, you’ll calculate ActiveDutyCycle values for all Columns in the region. Take all these values and find their mean. You can use this value as minActiveDutyCycle to get started.
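As a rough sketch of that suggestion in Python: the moving-average update, the `period` parameter, and both function names are assumptions for illustration, not NuPIC's exact code.

```python
def update_duty_cycle(duty_cycle, was_active, period):
    """Moving average of how often a column has been active."""
    new_value = 1.0 if was_active else 0.0
    return (duty_cycle * (period - 1) + new_value) / period


def mean_duty_cycle(duty_cycles):
    """Mean activeDutyCycle over all columns (global inhibition)."""
    return sum(duty_cycles) / len(duty_cycles)


# Made-up duty cycles for a four-column region.
active_duty_cycles = [0.0, 0.5, 0.25, 0.25]
min_active_duty_cycle = mean_duty_cycle(active_duty_cycles)
updated = update_duty_cycle(0.5, was_active=True, period=2)
```

Columns whose activeDutyCycle falls below `min_active_duty_cycle` would then be the candidates for boosting.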
You can see how NuPIC does it here (including local inhibition; named
That chart doesn’t look formatted correctly, I’ll link to a better version here:
The main ideas the chart is trying to express:
- If your column’s activeDutyCycle is very low (2%), its boostFactor will be very high (greater than 1).
- Conversely, a high activeDutyCycle (90%) will result in a small boostFactor (nearer to 0).
- If activeDutyCycle is a middle value, its boostFactor will be nearly equal to 1.
- The change between the above states is exponential and continuous, not linear or discrete.
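One function with the shape these bullets describe is an exponential of the gap between a target duty cycle and the column's actual one. The sketch below assumes that form; `target_duty_cycle`, `boost_strength`, and the values used are hypothetical choices for illustration.

```python
import math


def exp_boost_factor(active_duty_cycle, target_duty_cycle, boost_strength):
    """Continuous boost: above 1 when under-active, below 1 when over-active."""
    return math.exp(boost_strength * (target_duty_cycle - active_duty_cycle))


low = exp_boost_factor(0.02, target_duty_cycle=0.5, boost_strength=3.0)
middle = exp_boost_factor(0.50, target_duty_cycle=0.5, boost_strength=3.0)
high = exp_boost_factor(0.90, target_duty_cycle=0.5, boost_strength=3.0)
```

Here `low` is well above 1, `middle` is exactly 1, and `high` sits between 0 and 1, matching the three cases in the list.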
Why is BOOSTING said to be the reason for learning granularity information (as mentioned in the BOOSTING episode of HTM School) and not SPATIAL POOLING LEARNING? Learning also happens during the learning phase of spatial pooling. So the columns could have learned about time granularity (from the boosting episode again) when the spatial pooler learns… right? What’s the reason to say time granularity was learned only during boosting?
@baymaxx boosting is the mechanism the SpatialPooler relies on to allow every column to express itself. Otherwise some cells can dominate and never turn off, which is bad because, after learning, such a cell ends up carrying no information.
The actual learning is still done by the SP’s learning algorithm. Boosting is only assisting the SP to learn better.
You can read more about how boosting works in my blog post (the post is about me implementing another Numenta paper, but the boosting part is the same thing). The Boosting section is the only relevant one in this context.
Hi all, I’m still somewhat confused about the effect of boosting on the result of HTM learning. As I understand it, because of boosting, with the same input vector, after some iterations the output (active columns) of Spatial Pooling will change. If I use these results as input for Temporal Memory, it will return a different prediction. So it is not what we want. Please correct me if I am wrong. Thank you so much.
Learning in SP adapts the permanence values between each SP column and its receptive field of encoding bits, so yes the same raw input that activates the same encoding bits can activate different sets of SP columns at different times.
However, the two sets of SP columns representing the same input at different times should overlap quite a bit, unless the SP learning rates are so high that they change these permanences too drastically and basically destabilize the SP. So it seems this could potentially become an issue.
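One way to check this on your own data is to measure how much the two sets of active columns share. The sketch below is a generic set-overlap measure with made-up column indices; `overlap_fraction` is a hypothetical helper, not part of any HTM library.

```python
def overlap_fraction(active_a, active_b):
    """Shared active columns, as a fraction of the smaller set."""
    a, b = set(active_a), set(active_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


# Same input, encoded before and after some SP learning (made-up indices).
before = range(0, 40)
after = range(5, 45)
stability = overlap_fraction(before, after)
```

A value near 1 means the representation is stable enough for the Temporal Memory; a value that drops sharply at some time step would point to the kind of disruption discussed in this thread.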
I have noticed the same issue as @nluu. As @sheiser1 has explained, this is all fine and correct. However, in real life, boosting seems to be problematic.
For example, the SP learns a pattern very quickly, in just a few steps. However, once boosting becomes active, the SP briefly “forgets” all learned patterns and starts learning again.
At some point in time (at some learning iteration step) the SP learns the patterns again and remains stable, until boosting becomes active again.
In real-world applications, this means the following:
- The application learns patterns.
- Boosting becomes active and the SP forgets the learned patterns.
To recap, boosting is biologically cool, but how is it helpful in real-world applications?
Do I see that wrong?
We are currently discussing log boosting, its plausibility, and its impact on MNIST, and @vpuente shared great biological detail on why boosting is used, though probably only in the early stages.
Could someone please try to replicate @rhyolight 's results with log boosting? The formula is simple
@mrcslws Thanks for sharing this. It dates back ages to a problem some NASA guy raised: an SP with boosting on a “dummy” input sequence will produce periodic boosts in activity, which is unwanted.
Here by “inputs” you mean 25 actual different input patterns, not any bits, right?
Basically the issue is with over-provisioned SP.
I’d like to make this into a test.
My takeaway is, I understand what boosting should do and why it helps sometimes, on the other hand I don’t see clearly how it’s done biologically. And we have the edge cases here where boosting hurts.
What do you think of my proposal to remove boosting as it is now and replace the functionality with synaptic death? Unused synapses are removed (which solves the over-provisioning issue above), and this avoids “weak” columns by simply killing them off. We would then need a mechanism to add a new column when all the others are busy (well established, connected) and we still lack precision.
I’ve posted some proposals to diminish the negative impact of boosting