If you use this software in any publication, please cite the following article:
D. Benbouzid, R. Busa-Fekete, N. Casagrande, F.-D. Collin, and B. Kégl. MultiBoost: a multi-purpose boosting package. Journal of Machine Learning Research, 13:549–553, 2012.
Basic examples
These simple examples show how AdaBoost.MH can be trained using various base learners, such as (real-valued) decision stumps, (categorical) subset indicators, decision trees, and decision products, on the UCI pendigits dataset. The command
./multiboost --fileformat arff --traintest pendigitsTrain.arff pendigitsTest.arff 100 --verbose 3 --learnertype SingleStumpLearner --outputinfo resultsSingleStump.dta --shypname shypSingleStump.xml
runs AdaBoost.MH for 100 iterations using decision stumps as base learners. The learned model is saved to the shypSingleStump.xml file, and the default metrics are calculated in each iteration and stored in the resultsSingleStump.dta file. The command
./multiboost --fileformat arff --traintest trainNominal.arff testNominal.arff 100 --verbose 3 --learnertype IndicatorLearner --outputinfo results.dta --shypname shyp.xml
runs AdaBoost.MH with subset indicators as base learners. The command
./multiboost --fileformat arff --traintest pendigitsTrain.arff pendigitsTest.arff 100 --verbose 3 --learnertype TreeLearner --baselearnertype SingleStumpLearner 8 --outputinfo resultsPendigitTreeLearner.dta --shypname shypTree.xml
runs AdaBoost.MH with decision trees of (at most) 8 leaves, whereas the command
./multiboost --fileformat arff --traintest pendigitsTrain.arff pendigitsTest.arff 100 --verbose 3 --learnertype ProductLearner --baselearnertype SingleStumpLearner 3 --outputinfo resultsPendigitProductLearner.dta --shypname shypProduct.xml
runs AdaBoost.MH with decision products of 3 terms.
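If you want to script such runs, the following minimal sketch generates a toy two-class dataset in the ARFF format expected by --fileformat arff and launches multiboost on it. The file names, the toy data, and the ./multiboost path are assumptions; only the command-line flags are taken from the examples above.

# Minimal sketch: write a toy two-class ARFF file and launch multiboost
# on it via subprocess. Paths and data are hypothetical; adapt to your layout.
import random
import subprocess

def write_arff(path, n=200, seed=0):
    """Write a tiny two-feature, two-class dataset in ARFF format."""
    rng = random.Random(seed)
    with open(path, "w") as f:
        f.write("@RELATION toy\n")
        f.write("@ATTRIBUTE x1 NUMERIC\n")
        f.write("@ATTRIBUTE x2 NUMERIC\n")
        f.write("@ATTRIBUTE class {pos,neg}\n")
        f.write("@DATA\n")
        for _ in range(n):
            x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
            label = "pos" if x1 + x2 > 0 else "neg"
            f.write(f"{x1:.4f},{x2:.4f},{label}\n")

write_arff("toyTrain.arff")
write_arff("toyTest.arff", seed=1)

subprocess.run([
    "./multiboost",
    "--fileformat", "arff",
    "--traintest", "toyTrain.arff", "toyTest.arff", "10",
    "--verbose", "3",
    "--learnertype", "SingleStumpLearner",
    "--outputinfo", "resultsToy.dta",
    "--shypname", "shypToy.xml",
])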
FilterBoost and LazyBoost
The strong learner can be set by using the --stronglearner argument. The command
./multiboost --fileformat arff --traintest pendigitsTrain.arff pendigitsTest.arff 100 --verbose 3 --learnertype SingleStumpLearner --outputinfo resultsSingleStump.dta --shypname shypFilterSingleStump.xml --stronglearner FilterBoost
runs FilterBoost.
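For intuition, FilterBoost differs from AdaBoost.MH in how each round's training sample is obtained: instead of reweighting the full training set, it filters examples by rejection sampling against the current strong classifier. Below is a conceptual sketch of that filtering step, not MultiBoost's actual code; the stream, H, and batch_size names are hypothetical.

# Conceptual sketch (not MultiBoost's code) of the FilterBoost idea:
# each boosting round draws examples from a stream and accepts them with
# probability 1 / (1 + exp(y * H(x))), so the base learner only sees a
# filtered sample concentrated on hard examples.
import math
import random

def filter_sample(stream, H, batch_size, rng=random.Random(0)):
    """Accept examples from `stream` by rejection sampling on the current
    strong classifier H; returns a batch for the next base learner."""
    batch = []
    for x, y in stream:  # y is a +1/-1 label in this binary sketch
        q = 1.0 / (1.0 + math.exp(y * H(x)))  # misclassified or low-margin
        if rng.random() < q:                  # examples are likely accepted
            batch.append((x, y))
            if len(batch) == batch_size:
                break
    return batch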
LazyBoost [1] can be run by choosing AdaBoostMH as the strong learner and setting the --rsample parameter to the number of features to be sampled in each iteration. The command
./multiboost --fileformat arff --traintest pendigitsTrain.arff pendigitsTest.arff 100 --verbose 3 --learnertype SingleStumpLearner --outputinfo resultsSingleStump.dta --shypname shypLazySingleStump.xml --rsample 100
runs LazyBoost with 100 features sampled in each iteration.
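Conceptually, LazyBoost saves time by scanning only a random subset of the features in each boosting iteration; --rsample controls the subset size. The sketch below (not MultiBoost's code) illustrates just the sampling step.

# Conceptual sketch (not MultiBoost's code) of LazyBoost's acceleration:
# each iteration scans only a random subset of --rsample features for the
# best decision stump, instead of all features.
import random

def lazy_feature_subset(n_features, rsample, rng=random.Random(0)):
    """Return the indices of the features scanned in this iteration."""
    return rng.sample(range(n_features), min(rsample, n_features))

# e.g. with --rsample 100 on a 10,000-feature problem, each boosting
# iteration evaluates stumps on 100 randomly chosen features:
subset = lazy_feature_subset(10_000, 100)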
Haar filters
The command
./multiboost --fileformat arff --traintest uspsHaarTrain.arff uspsHaarTest.arff 100 --verbose 3 --learnertype TreeLearner --baselearnertype HaarSingleStumpLearner 16 --csample num 100 --iisize 15x15 --seed 71 --outputinfo result.dta --shypname shyp.xml
runs AdaBoost.MH with stumps over Haar filters as base learners. The uspsHaarTrain.arff and uspsHaarTest.arff files contain integral images. The --csample argument sets the number of filters sampled in each iteration, the --iisize option specifies the size of the image, and the --seed argument initializes the random number generator. For more information about Haar filters, see [2].
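For readers preparing their own data, the integral-image representation stored in the uspsHaarTrain.arff and uspsHaarTest.arff files can be computed as below. This is a generic sketch of the standard construction [2], not code shipped with MultiBoost.

# ii[r][c] is the sum of all pixels above and to the left of (r, c)
# inclusive, so any rectangular Haar-filter sum can later be read off
# with four lookups.
def integral_image(img):
    """img: 2-D list of pixel values; returns the same-size integral image."""
    rows, cols = len(img), len(img[0])
    ii = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r][c] = row_sum + (ii[r - 1][c] if r > 0 else 0)
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top..bottom][left..right] via four integral-image lookups."""
    total = ii[bottom][right]
    if top > 0:
        total -= ii[top - 1][right]
    if left > 0:
        total -= ii[bottom][left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1][left - 1]
    return total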
Using bandits to accelerate AdaBoost
Bandit boosting [3, 4] is an alternative to LazyBoost that makes feature sampling more efficient. The command
./multiboost --fileformat arff --traintest pendigitsTrain.arff pendigitsTest.arff 100 --verbose 3 --learnertype BanditSingleStumpLearner --outputinfo resultsBandits.dta --shypname shypBanditSingleStump.xml --banditalgo UCBK --updaterule logedge
runs AdaBoost.MH with decision stumps whose feature is selected in each iteration by the UCB-based bandit algorithm UCBK, with rewards computed from the edge of the chosen stump (the logedge update rule).
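As a rough illustration of the selection rule behind --banditalgo UCBK (this is not MultiBoost's implementation), UCB treats each feature as a bandit arm and pulls the arm with the highest mean reward plus an exploration bonus that shrinks as the arm is tried more often:

# Conceptual sketch (not MultiBoost's code) of a UCB1-style index:
# mean_reward[i] and pull_count[i] are the statistics of arm/feature i,
# t is the total number of pulls so far.
import math

def ucb_choice(mean_reward, pull_count, t):
    """Return the index of the arm/feature to try next."""
    def index(i):
        if pull_count[i] == 0:
            return float("inf")  # try every arm at least once
        return mean_reward[i] + math.sqrt(2.0 * math.log(t) / pull_count[i])
    return max(range(len(mean_reward)), key=index)

The command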
./multiboost --fileformat svmlight --traintest train.txt test.txt 1000 --verbose 5 --learnertype SingleSparseStumpLearner --outputinfo resultsSparseStump.dta --constant --shypname shypSparseStump.xml --weightpolicy proportional
runs AdaBoost.MH using decision stumps implemented for sparse features. This base learner is usually coupled with the svmlight input format, which allows storing sparse data. The command
./multiboost --fileformat svmlight --traintest train.txt test.txt 1000 --verbose 5 --learnertype BanditSingleSparseStump --outputinfo resultsBanditsSparseStump.dta --constant --shypname shypBanditsSparseStump.xml --weightpolicy proportional --banditalgo EXP3P --updaterule logedge --rsample 2
runs the same learner but uses a bandit-based acceleration.
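If you need to produce such sparse input yourself, the svmlight format is plain text with one example per line: a label followed by index:value pairs for the nonzero features. Below is a minimal writer sketch; the file name and toy data are assumptions.

# Minimal sketch: write a sparse dataset in the svmlight format expected
# by --fileformat svmlight.
def write_svmlight(path, examples):
    """examples: list of (label, {feature_index: value}) pairs;
    feature indices are conventionally 1-based."""
    with open(path, "w") as f:
        for label, features in examples:
            pairs = " ".join(f"{i}:{v}" for i, v in sorted(features.items()))
            f.write(f"{label} {pairs}\n")

write_svmlight("train.txt", [
    (1, {3: 0.5, 17: 1.0}),
    (-1, {1: 2.0, 42: -0.3}),
])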
References
[1] G. Escudero, L. Màrquez, and G. Rigau. Boosting applied to word sense disambiguation. In Proceedings of the 11th European Conference on Machine Learning, pages 129–141, 2000.
[2] P. Viola and M. Jones. Robust real-time face detection. International Journal of Computer Vision, 57:137–154, 2004.
[3] R. Busa-Fekete and B. Kégl. Accelerating AdaBoost using UCB. In KDDCup 2009 (JMLR W&CP), volume 7, pages 111–122, Paris, France, 2009.
[4] R. Busa-Fekete and B. Kégl. Fast boosting using adversarial bandits. In International Conference on Machine Learning, volume 27, pages 143–150, 2010.