Problems with boosting?

Continuing the discussion from Why maxBoost is always 2.0?:

Marek (@breznak) brought up that…

(emphasis mine)

I found these issues:

As can be read in the discussion between @breznak and @scott, the claim that boosting is broken is currently disputed. I would say the jury is out until we have proof that a change in boosting can improve the performance of the algorithms.

TL;DR of the discussion (please read it for the details, as I’m simplifying): I claimed boosting is “broken” because it causes artificial rises in column activation (which is what it is designed to do), but that in turn produces spurious anomaly scores (false-positive anomaly errors). @scott countered that this is exactly what boosting is supposed to do.
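
To make the mechanism concrete, here is a minimal sketch (plain numpy, not NuPIC code) of how boosted overlaps can change which columns win even when the input is unchanged; if the temporal memory had learned to predict the old winning set, the changed columns show up as unpredicted activity and the anomaly score rises:

```python
import numpy as np

rng = np.random.default_rng(0)
num_columns, num_active = 100, 10

# Raw overlap scores for one fixed input (these do not change below).
overlaps = rng.integers(0, 40, size=num_columns).astype(float)

def active_columns(boost):
    """Winner-take-all on boosted overlaps (global inhibition)."""
    return set(np.argsort(overlaps * boost)[-num_active:])

unboosted_winners = active_columns(np.ones(num_columns))

# Pretend boosting has ramped up the factors of some rarely active columns.
boost = np.ones(num_columns)
boost[rng.choice(num_columns, size=20, replace=False)] = 10.0
boosted_winners = active_columns(boost)

# If the temporal memory had learned to predict exactly the unboosted set,
# every column that changed counts as "active but unpredicted".
unpredicted = boosted_winners - unboosted_winners
print("columns that changed:", len(unpredicted))
print("resulting anomaly score:", len(unpredicted) / float(num_active))
```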

So, how to test the impact?

  • I’ve replicated the issue on a simple dataset (a sine wave) where the problem shows up.
  • Indirect evidence is that swarming selects maxBoost=1.0 as the optimal parameter setting; see NAB.
  • A real-world test would be to run NAB with maxBoost=1.0 and with maxBoost=10.0 (a toy proxy for this comparison is sketched after this list).
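
As a self-contained proxy for that comparison (this is not NAB and not NuPIC code; the encoder, the miniature spatial pooler, and the boost formula below are all simplifications), the sketch feeds a repeating sine wave to a toy pooler and counts how often the winning column set changes for identical inputs under maxBoost=1.0 versus 10.0:

```python
import numpy as np

def encode(value, n_bits=200, active_bits=21):
    """Crude scalar encoder: a contiguous block of active bits for values in [-1, 1]."""
    sdr = np.zeros(n_bits)
    start = int((value + 1.0) / 2.0 * (n_bits - active_bits))
    sdr[start:start + active_bits] = 1.0
    return sdr

def run(max_boost, steps=3000, n_cols=256, n_bits=200, n_active=10,
        duty_period=1000.0, period=100):
    rng = np.random.default_rng(42)
    # Fixed random potential connections; permanence learning is left out on purpose.
    connections = (rng.random((n_cols, n_bits)) < 0.3).astype(float)
    duty = np.full(n_cols, n_active / float(n_cols))
    boost = np.ones(n_cols)
    last_winners = {}   # input bucket -> winning column set on the previous cycle
    disruptions = 0
    for t in range(steps):
        sdr = encode(np.sin(2.0 * np.pi * t / period), n_bits)
        winners = set(np.argsort((connections @ sdr) * boost)[-n_active:])
        bucket = t % period             # same bucket -> identical encoded input
        if bucket in last_winners and winners != last_winners[bucket]:
            disruptions += 1
        last_winners[bucket] = winners
        # Duty-cycle update and a linear boost, roughly in the spirit of the
        # old maxBoost formulation: rarely active columns get boosted toward maxBoost.
        active = np.zeros(n_cols)
        active[list(winners)] = 1.0
        duty += (active - duty) / duty_period
        target = n_active / float(n_cols)
        boost = np.clip(1.0 + (max_boost - 1.0) * (1.0 - duty / target),
                        1.0, max_boost)
    return disruptions

for mb in (1.0, 10.0):
    print("maxBoost=%.1f -> winning-set changes for identical inputs: %d"
          % (mb, run(mb)))
```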

I don’t have NAB running here at the moment; can someone replicate the experiment and confirm or refute the performance impact?

For simple streaming data, including all the NAB data files, maxBoost should be set to 1.0. It should only be greater than 1.0 for more complex datasets.
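
For illustration, an OPF-style spParams fragment along those lines; the parameter names assume an older NuPIC release that still exposes maxBoost (newer releases replaced it with boostStrength), and the values are only ballpark anomaly-model settings, not a prescription:

```python
spParams = {
    "columnCount": 2048,
    "globalInhibition": 1,
    "numActiveColumnsPerInhArea": 40,
    "potentialPct": 0.8,
    "synPermActiveInc": 0.003,
    "synPermConnected": 0.2,
    "synPermInactiveDec": 0.0005,
    # Boosting disabled for simple streams such as the NAB files; raise only
    # for datasets that are demonstrably complex enough to need it.
    "maxBoost": 1.0,
}
```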

Before we can say that boosting is broken, we need to test it on data streams that really require it.

If you test it on NAB or simple datasets with maxBoost=10.0, you will get lots of problems, but that is totally expected.

Then the default should be 1.0, as you suggest, and there should be a prominent warning in the parameter docs that values above 1.0 are only appropriate for sufficiently complex datasets.

Now, can we test it? Do you know of a dataset that is “complex” enough to benefit from boosting? If there is no such dataset, I would go even further and suggest removing the boosting code altogether.

Side note: regardless of whether the impact of boosting is positive or negative, part of the PR was an implementation of boosting that would not have the negative property of “visible disruptions”.
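
The PR itself is not reproduced here, but for illustration, one smoother scheme that is often suggested (this is an assumption about what such an implementation could look like, not the PR’s code) drives an exponential boost from each column’s active duty cycle, so the factors drift gradually instead of jumping:

```python
import numpy as np

def update_boost_factors(active_duty_cycles, target_density, boost_strength):
    """Columns active less often than the target get boosted, smoothly and continuously."""
    return np.exp(boost_strength * (target_density - np.asarray(active_duty_cycles)))

# Example: five columns, target density 0.02.
duty = np.array([0.00, 0.01, 0.02, 0.05, 0.10])
print(update_boost_factors(duty, target_density=0.02, boost_strength=10.0))
# Factors above 1 for under-active columns, below 1 for over-active ones.
```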

Sure, spatial pooling with 2D topology on a large set of natural images would be a good test set. Setting up the task to do this properly requires some thought.
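
As a rough sketch of one way such a setup could start (the preprocessing and the sp_config names below are assumptions, not a worked-out protocol): binarize whitened image patches into sparse bit arrays and feed them to a spatial pooler configured with matching 2-D dimensions and local inhibition.

```python
import numpy as np

def patch_to_sdr(patch, sparsity=0.05):
    """Keep only the pixels that deviate most from the patch mean as active bits."""
    flat = np.abs(patch - patch.mean()).ravel()
    k = max(1, int(sparsity * flat.size))
    threshold = np.sort(flat)[-k]
    return (flat >= threshold).astype(np.uint8).reshape(patch.shape)

rng = np.random.default_rng(0)
image = rng.random((256, 256))              # stand-in for a whitened natural image
sdr = patch_to_sdr(image[100:132, 50:82])   # one 32x32 patch

sp_config = {                                # assumed topological SP settings
    "inputDimensions": (32, 32),
    "columnDimensions": (32, 32),
    "potentialRadius": 8,
    "globalInhibition": False,
    "localAreaDensity": 0.02,
}
print(sdr.sum(), "active bits out of", sdr.size)
```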

That would be fine, but a PR doesn’t cut it in my opinion. It should be tested somewhere external (such as nupic-community) in a form that others can run and see the results. This would be a non-trivial change to a core part of NuPIC, so it needs to be well described and tested on a range of data. For a core algorithm like spatial pooling, we also need to make sure it is something that is plausible biologically. Then a proposal can be made to incorporate it into NuPIC. (This is the same process we use ourselves - we test and document in nupic.research first.)

If you could point me to any tests and results for the existing boosting mechanisms, I would appreciate it…

That is a very good question. I don’t know that we have algorithm-level tests of boosting (it hasn’t really been a focus of ours). I think we can come up with a list of desired properties and then test whether the current implementation matches those properties.
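
For example (the property list here is only a suggestion, not an existing Numenta test suite), two candidate properties and toy metrics for them: column usage should become roughly uniform over time, and the SDR for a fixed input should stay stable once learning has settled.

```python
import numpy as np

def usage_entropy(active_duty_cycles):
    """Normalized entropy of column usage: 1.0 means perfectly uniform usage."""
    p = np.asarray(active_duty_cycles, dtype=float)
    p = p / p.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum() / np.log(len(active_duty_cycles)))

def sdr_stability(sdr_before, sdr_after):
    """Overlap between two active-column sets for the same input (1.0 = identical)."""
    a, b = set(sdr_before), set(sdr_after)
    return len(a & b) / float(max(len(a), len(b)))

# Toy checks of the helpers themselves:
print(usage_entropy([0.02] * 50))            # uniform usage -> 1.0
print(usage_entropy([1.0] + [0.0] * 49))     # one column hogs activity -> 0.0
print(sdr_stability([1, 5, 9], [1, 5, 42]))  # 2 of 3 columns unchanged -> ~0.67
```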
