Benchmarks

If you use this software in any publication, please cite the following article:

D. Benbouzid, R. Busa-Fekete, N. Casagrande, F.-D. Collin, and B. Kégl.
MultiBoost: a multi-purpose boosting package.
Journal of Machine Learning Research, 13:549–553, 2012.

This page contains test error rates and learning curves on commonly used benchmark data sets.

For now, we are re-running the experiments from our ICML'09 paper with the hyperparameters validated there, but using the new tree learner. We plan to describe the full validation procedure and the validation results for reproducibility, and to run the setup on a larger pool of data sets.

The error rates (in %) in the table are computed by averaging the running error rates over the last 20% of the iterations. For trees and products we see no overfitting (the learning curves converge to their best values), so the procedure simply smooths the learning curves. When overfitting occurs (as in the case of stumps), the error obtained in this way is somewhat pessimistic; we will replace it with the error at the validated number of iterations as soon as the validation finishes.
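
The averaging over the last 20% of the iterations can be illustrated with a short Python sketch. This is not part of MultiBoost: the function name tail_average_error, the input format (one running error rate in % per iteration), and the rounding of the tail length are assumptions made for the example.

```python
def tail_average_error(errors, fraction=0.2):
    """Average the running error rates over the last `fraction` of iterations.

    `errors` is a hypothetical list holding one test error rate (in %)
    per boosting iteration; it is not a MultiBoost output format.
    """
    if not errors:
        raise ValueError("errors must contain at least one iteration")
    # Assumption: round the tail length to the nearest integer, use at least 1.
    n_tail = max(1, round(fraction * len(errors)))
    tail = errors[-n_tail:]
    return sum(tail) / len(tail)


# A converging curve: the tail average only smooths the final values.
curve = [10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
print(tail_average_error(curve))  # 1.5 (mean of the last two values)
```

If the curve rises again towards the end, as with overfitting stumps, the tail average is larger than the minimum of the curve, which is why the reported value is pessimistic in that case.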

The MultiBoost commands generating the learning curves (including the hyperparameters) can be seen by hovering the pointer over the error rates. The learning curves can be viewed by clicking on the error rates.

In the case of 10-fold error rates, the hyperparameters were validated separately for each fold: within each fold, a second 10-fold cross-validation was carried out to select the hyperparameters, i.e., the number of iterations and the hyperparameter of the base learner. In the figures, the colored balls on the learning curves mark the validated iteration number for each fold. The error rate at a given iteration number is the average of the error rates of the 10 folds. When computing this average, for iteration numbers larger than a fold's validated iteration number we kept that fold's error rate at its validated iteration. This is why the learning curves are horizontal after the last validated iteration number.
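
The way the fold curves are averaged can be sketched in Python as follows. This is only an illustration of the rule described above, not the code used to produce the figures: the function name frozen_average_curve, the list-based inputs, and the 0-based iteration indices are assumptions.

```python
def frozen_average_curve(fold_curves, validated_iters):
    """Average per-fold learning curves, freezing each fold's error
    rate once its validated iteration has been passed.

    fold_curves:     hypothetical list of per-fold error curves, all of
                     the same length (one error rate in % per iteration).
    validated_iters: hypothetical list with the validated iteration
                     index (0-based) of each fold.
    """
    n_iters = len(fold_curves[0])
    averaged = []
    for t in range(n_iters):
        total = 0.0
        for curve, t_star in zip(fold_curves, validated_iters):
            # Beyond t_star, keep the error rate at the validated iteration.
            total += curve[min(t, t_star)]
        averaged.append(total / len(fold_curves))
    return averaged


# Two folds for brevity; the page uses 10.
curves = [[5.0, 4.0, 3.0, 4.0],
          [6.0, 5.0, 6.0, 7.0]]
print(frozen_average_curve(curves, validated_iters=[2, 1]))
# [5.5, 4.5, 4.0, 4.0]
```

In this toy example the largest validated index is 2, so the averaged curve is constant from that iteration on, which matches the horizontal tail seen in the figures.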