Why is the default number of columns so high?

I’ve done some hyperparameter optimisation on a temporal prediction task (203k training rows; RMSE is measured on 105k testing rows) and found that the best predictions are not necessarily obtained with 2048 columns (the default). I optimised over most of the available parameters, and here’s the result:

I can provide the dataset of all the models, but it looks like the lowest RMSE I can get for this task is ~20.

The downside of having so many columns is that the model takes longer to run. So why are we given so many columns by default?

The default value is a “safe” number: it gives high capacity and sits in a very good range according to SDR math. It should work well across a very wide spectrum of tasks, which lets us avoid optimizing parameters for every dataset.
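To make the "SDR math" point concrete, here is a rough sketch of the two standard combinatorial quantities: how many distinct SDRs exist for a given size, and how likely a random SDR is to falsely match a stored one. The function names are made up for illustration, and the 2% sparsity and the "half the active bits" match threshold are assumptions of mine, not values taken from this thread.

```python
from math import comb


def sdr_capacity(n: int, w: int) -> int:
    """Number of distinct SDRs with n bits, exactly w of them active."""
    return comb(n, w)


def false_match_probability(n: int, w: int, theta: int) -> float:
    """Chance that a random SDR (n bits, w active) shares at least
    theta active bits with one fixed SDR of the same size."""
    total = comb(n, w)
    # Count the random SDRs whose overlap with the fixed SDR is b bits,
    # for every b from the threshold up to a full match.
    matches = sum(comb(w, b) * comb(n - w, w - b) for b in range(theta, w + 1))
    return matches / total


if __name__ == "__main__":
    for n in (256, 512, 1024, 2048):
        w = n * 2 // 100   # assumed ~2% sparsity
        theta = w // 2     # assumed match threshold: half the active bits
        print(n, w, sdr_capacity(n, w), false_match_probability(n, w, theta))
```

Running it shows the false-match probability dropping steeply as the column count grows, which is the sense in which a large default is "safe". It does not say anything about whether a smaller number is good enough for a particular dataset.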

Of course, for any given task you might be able to run faster by using a smaller number of columns. (You could also reduce the number of cells per minicolumn.)
