The swarming process generates many candidate models, each with a different combination of hyperparameter values, and evaluates those models over the course of N time steps. The larger N is, the more thorough an evaluation each candidate model gets. Given this, it’d be theoretically ideal to make N as large as possible so each candidate model has maximal time to prove its worth.

The major problem with large N is that it can make the whole process take a REALLY long time, since each candidate model needs to be initialized, fed N data points, and evaluated on its MAPE (mean absolute percent error, I believe) over those N predictions. This can be mitigated though by setting the swarm size to ‘small’ or ‘medium’ at most, which caps the number of candidate models by limiting the set of different hyperparameter combos to try.
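To make the evaluation concrete, here’s a minimal sketch of what a MAPE score over N predictions looks like. This is my own illustration, not NuPIC’s actual scoring code, and it uses one common convention (skipping zero-valued actuals) that NuPIC may or may not share:

```python
def mape(actuals, predictions):
    """Mean absolute percent error over paired actual/predicted values.

    Skips time steps where the actual value is zero to avoid division
    by zero -- one common convention; NuPIC's internals may differ.
    """
    errors = [abs(a - p) / abs(a) for a, p in zip(actuals, predictions) if a != 0]
    return 100.0 * sum(errors) / len(errors)

# Example: score one candidate model over a short stretch of data.
actuals = [10.0, 12.0, 11.0, 13.0]
predictions = [9.0, 12.0, 10.0, 13.0]
print(round(mape(actuals, predictions), 2))  # 4.77
```

Each candidate pays this cost over all N steps, which is why a large N multiplied across many candidates gets expensive fast.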

The other major caveat in my mind about swarming is that it evaluates each model by one specific criterion: how well did this model forecast the value of metric *X1*? Swarming finds model configs with the sole goal of minimizing forecasting MAPE for *X1*, so there are no guarantees for *X2* or any other metric, nor for anomaly detection performance, since that’s a different objective.
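You can see this single-objective setup in the shape of a NuPIC-style swarm description. The sketch below is from memory, so treat the exact field names as approximate; the point is that there is one `predictedField`, and the swarm scores every candidate against that field alone:

```python
# Sketch of a NuPIC-style swarm description (field names approximate).
# X2 can be *included* as an input field, but only X1's forecasting
# error drives which hyperparameter combo wins.
swarm_description = {
    "includedFields": [
        {"fieldName": "X1", "fieldType": "float"},
        {"fieldName": "X2", "fieldType": "float"},
    ],
    "inferenceType": "TemporalMultiStep",
    "inferenceArgs": {
        "predictedField": "X1",   # the ONLY metric the swarm scores against
        "predictionSteps": [1],
    },
    "iterationCount": 3000,       # N: time steps each candidate is run over
    "swarmSize": "medium",        # caps how many hyperparameter combos are tried
}
```

So a config that swarms out as great for *X1* forecasting is, strictly speaking, untested for anything else.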

If the statistics of the data have changed so much that the model’s hyperparameters are no longer valid, then yes, that would theoretically call for re-running the swarm. But this should hopefully be quite unlikely barring some kind of tectonic shift in the data, where instead of *X1* ranging mostly from 0 to 1 it is now 100 to 1000 or something.
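For that kind of tectonic shift, even a crude scale check would catch it. This is a hypothetical heuristic of my own, not anything in NuPIC: compare the typical magnitude of recent values against the data the swarm was run on:

```python
def scale_shift(baseline, recent, factor=10.0):
    """Crude 'tectonic shift' check: has the typical magnitude of the
    metric moved by more than `factor`x relative to the data the swarm
    saw? (A hypothetical heuristic, not part of NuPIC.)"""
    base_mean = sum(abs(x) for x in baseline) / len(baseline)
    recent_mean = sum(abs(x) for x in recent) / len(recent)
    return recent_mean > factor * base_mean or recent_mean < base_mean / factor

baseline = [0.2, 0.5, 0.8, 0.4]       # X1 used to range mostly 0-1
recent = [150.0, 400.0, 900.0]        # now it's more like 100-1000
print(scale_shift(baseline, recent))  # True -> hyperparameters likely stale
```

Ordinary drift within the original range wouldn’t trip this, which matches the intuition that re-swarming should be reserved for genuinely out-of-scale changes.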

Having a larger N in your swarm should make this scenario less likely, since the chosen candidate model is vetted on more data, though if your N is 3000 and the tectonic shift happens only after those 3000 steps, I suppose the issue would persist.

One other thing that running the swarm periodically would do is limit continuous learning, since each new swarm run would yield a newborn model that has to train from zero. You’d also need to introduce some criterion to trigger a new swarm, having declared the current model obsolete. Barring that tectonic shift scenario, the continuous learning nature of HTM should make this unnecessary!
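If you did want such a trigger criterion, one sketch (again hypothetical, not a NuPIC feature) is to declare the model obsolete only after its percent error stays high for a sustained stretch, so ordinary noisy steps don’t throw away a continuously-learned model:

```python
class ReswarmTrigger:
    """Hypothetical obsolescence criterion: fire only when per-step
    percent error exceeds `threshold` for `patience` consecutive steps.
    Not part of NuPIC -- just one way to decide when to re-swarm."""

    def __init__(self, threshold=25.0, patience=100):
        self.threshold = threshold  # percent error considered "bad"
        self.patience = patience    # consecutive bad steps before firing
        self.bad_steps = 0

    def update(self, actual, predicted):
        if actual == 0:
            return False  # percent error undefined; skip this step
        pct_err = 100.0 * abs(actual - predicted) / abs(actual)
        self.bad_steps = self.bad_steps + 1 if pct_err > self.threshold else 0
        return self.bad_steps >= self.patience
```

Because a single good prediction resets the counter, the trigger only fires on sustained degradation, which is exactly the tectonic-shift case rather than a passing anomaly.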