As can be read in this discussion between @breznak and @scott, the claim that boosting is broken is currently disputed. I would say the jury is out until we have evidence that a change to boosting actually improves the performance of the algorithms.
TL;DR of the discussion (please read the original thread for details, as I'm simplifying): I claimed boosting is "broken" because it causes artificial rises in column activation (which is what it is designed to do), but those rises lead to spurious anomaly scores (false-positive anomaly errors). @scott countered that this is exactly what boosting is supposed to do.
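To make the mechanism concrete, here is a small standalone sketch (plain numpy, not NuPIC code) of the classic maxBoost-style rule as I understand it: a column that has been starved of activity gets its overlap multiplied by a factor that grows toward maxBoost, so it can abruptly out-compete well-trained columns and change the active SDR, which downstream looks like an anomaly.

```python
import numpy as np

def boost_factors(active_duty_cycles, min_duty_cycle, max_boost):
    """Linear interpolation from max_boost (at duty cycle 0) down to 1.0
    (at duty cycle >= min_duty_cycle), roughly the classic SP boosting rule."""
    factors = ((1.0 - max_boost) / min_duty_cycle) * active_duty_cycles + max_boost
    factors[active_duty_cycles > min_duty_cycle] = 1.0
    return factors

overlaps = np.array([40.0, 38.0, 5.0])   # raw input overlaps of 3 columns
duty     = np.array([0.20, 0.18, 0.0])   # column 2 has (almost) never been active

for max_boost in (1.0, 10.0):
    boosted = overlaps * boost_factors(duty, min_duty_cycle=0.01, max_boost=max_boost)
    print("maxBoost=%.1f -> boosted overlaps %s, winner column %d"
          % (max_boost, boosted, boosted.argmax()))

# With maxBoost=1.0 the well-trained column wins; with maxBoost=10.0 the starved
# column (5 * 10 = 50) suddenly wins, changing the SDR and downstream predictions.
```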
So, how to test the impact?
- I've replicated the issue on a simple dataset (a sine wave), where the spurious anomalies clearly show up.
- An indirect hint is that swarming selects boosting disabled (maxBoost=1.0) as the optimal parameter setting; see NAB.
- A real-world test would be running NAB with maxBoost=1.0 and with maxBoost=10.0 and comparing the scores.
I don't have NAB running here at the moment; can someone replicate the experiment and confirm or deny the performance impact? A rough sketch of such a comparison is below.
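The import paths and constructor arguments in this sketch follow the older NuPIC Python SpatialPooler (where the parameter is called maxBoost; newer releases renamed it boostStrength), so treat the exact names as assumptions and adapt them to your version. It feeds a sine wave through a scalar encoder and the SP, and uses column turnover between consecutive steps as a crude proxy for the disruptions that show up as anomaly false positives.

```python
import numpy as np
from nupic.encoders import ScalarEncoder
from nupic.research.spatial_pooler import SpatialPooler  # older import path

def disruption_score(max_boost, steps=2000):
    enc = ScalarEncoder(w=21, minval=-1.0, maxval=1.0, n=400)
    sp = SpatialPooler(inputDimensions=(400,), columnDimensions=(1024,),
                       globalInhibition=True, numActiveColumnsPerInhArea=20,
                       maxBoost=max_boost, seed=42)
    active = np.zeros(1024, dtype="uint32")
    prev, flips = None, []
    for t in range(steps):
        sdr = enc.encode(np.sin(2 * np.pi * t / 100.0))
        sp.compute(sdr, True, active)
        cur = set(np.nonzero(active)[0])
        if prev is not None:
            flips.append(len(cur - prev))   # columns that changed vs. the last step
        prev = cur
    return np.mean(flips), np.max(flips)

for mb in (1.0, 10.0):
    mean_t, max_t = disruption_score(mb)
    print("maxBoost=%.1f  mean/max column turnover: %.2f / %d" % (mb, mean_t, max_t))
```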
Then the default should be 1.0, as you suggest, and there should be a big fat warning in the parameter docs that boosting only helps on sufficiently complex datasets.
Now, can we actually test that? Do you know of a dataset that is "complex" enough to benefit from boosting? If there is no such dataset, I'd go even further and say we should remove the boosting code altogether.
Sidenote: regardless of whether boosting's impact is positive or negative, part of the PR was an implementation of boosting that would not have the negative property of "visible disruptions".
Sure, spatial pooling with 2D topology on a large set of natural images would be a good test set. Setting up the task properly requires some thought.
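For illustration only, a minimal configuration sketch of what such a setup might look like. The constructor arguments follow the older NuPIC SpatialPooler API, and load_binary_patches() is a hypothetical helper standing in for whatever image-patch encoding is chosen.

```python
import numpy as np
from nupic.research.spatial_pooler import SpatialPooler  # older import path

sp = SpatialPooler(inputDimensions=(32, 32),      # binary image patch
                   columnDimensions=(32, 32),     # 2D column topology
                   potentialRadius=8,             # local receptive fields
                   globalInhibition=False,        # local inhibition, so topology matters
                   numActiveColumnsPerInhArea=10,
                   maxBoost=1.0,                  # the parameter under dispute
                   seed=1)

active = np.zeros(32 * 32, dtype="uint32")
for patch in load_binary_patches():               # hypothetical generator of 0/1 patches
    sp.compute(patch.flatten(), True, active)
```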
That would be fine, but a PR doesn’t cut it in my opinion. It should be tested outside somewhere (such as nupic-community) in a form that others can run and see the results. This would be a non-trivial change to a core part of NuPIC - so it needs to be well described and tested on a bunch of data. For a core algorithm like spatial pooling, we also need to make sure it is something that is plausible biologically. Then a proposal can be made to incorporate it into NuPIC. (This is the same process we use ourselves - we test and document in nupic.research first.)
That is a very good question. I don't know that we have algorithm-level tests of boosting (it hasn't really been a focus of ours). I think we can come up with a list of desired properties and then test whether the current implementation matches them.
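As a starting point, here is a sketch of one candidate property expressed as a test: on a skewed input stream, boosting should make column usage more uniform (higher entropy of the activation counts) than no boosting. The run_sp_on_stream helper is hypothetical, and the property itself is only a suggestion, not an agreed specification.

```python
import numpy as np

def duty_cycle_entropy(active_counts):
    """Shannon entropy of the empirical column-activation distribution."""
    p = active_counts / float(active_counts.sum())
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def test_boosting_spreads_column_usage(run_sp_on_stream):
    """run_sp_on_stream(max_boost) -> per-column activation counts (hypothetical)."""
    h_off = duty_cycle_entropy(run_sp_on_stream(max_boost=1.0))
    h_on  = duty_cycle_entropy(run_sp_on_stream(max_boost=10.0))
    assert h_on >= h_off, "boosting should not make column usage less uniform"
```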