Examples

If you use this software in any publication, please cite the following article:

D. Benbouzid, R. Busa-Fekete, N. Casagrande, F.-D. Collin, and B. Kégl. MultiBoost: a multi-purpose boosting package. Journal of Machine Learning Research, 13:549–553, 2012.

Basic examples

These simple examples show how AdaBoost.MH can be trained with various base learners, such as (real-valued) decision stumps, (categorical) subset indicators, decision trees, and decision products, on the UCI pendigits dataset. The command

./multiboost --fileformat arff --traintest pendigitsTrain.arff pendigitsTest.arff 100 --verbose 3 --learnertype SingleStumpLearner --outputinfo resultsSingleStump.dta --shypname shypSingleStump.xml

runs AdaBoost.MH for 100 iterations using decision stumps as base learners. The learned model is saved to the shypSingleStump.xml file, and the default metrics are computed at each iteration and stored in the resultsSingleStump.dta file. The command

./multiboost --fileformat arff --traintest trainNominal.arff testNominal.arff 100 --verbose 3 --learnertype IndicatorLearner --outputinfo results.dta --shypname shyp.xml

runs AdaBoost.MH with subset indicators as base learners on a dataset with nominal features. The command

./multiboost --fileformat arff --traintest pendigitsTrain.arff pendigitsTest.arff 100 --verbose 3 --learnertype TreeLearner --baselearnertype SingleStumpLearner 8 --outputinfo resultsPendigitTreeLearner.dta --shypname shypTree.xml

runs AdaBoost.MH with decision trees of (at most) 8 leaves, whereas the command 

./multiboost --fileformat arff --traintest pendigitsTrain.arff pendigitsTest.arff 100 --verbose 3 --learnertype ProductLearner --baselearnertype SingleStumpLearner 3 --outputinfo resultsPendigitProductLearner.dta --shypname shypProduct.xml

runs AdaBoost.MH with decision products of 3 terms. 
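
All of the above commands read the data in ARFF format. As a point of reference, here is a minimal, made-up ARFF file for a three-feature, three-class problem; the attribute names and values are purely illustrative (the pendigits files follow the same layout with 16 numeric features and 10 classes):

@relation toy

@attribute x1 numeric
@attribute x2 numeric
@attribute x3 numeric
@attribute class {0,1,2}

@data
47,100,27,0
0,89,63,1
12,5,71,2

The class attribute is nominal; its declared values define the labels of the multi-class problem.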

FilterBoost and LazyBoost

The strong learner can be set by using the --stronglearner argument. The command 

./multiboost --fileformat arff --traintest pendigitsTrain.arff pendigitsTest.arff 100 --verbose 3 --learnertype SingleStumpLearner --outputinfo resultsSingleStump.dta --shypname shypFilterSingleStump.xml --stronglearner FilterBoost

runs FilterBoost.

LazyBoost [1] can be run by choosing AdaBoostMH as the strong learner and setting the --rsample parameter to the number of features to be sampled in each iteration. The command 

./multiboost --fileformat arff --traintest pendigitsTrain.arff pendigitsTest.arff 100 --verbose 3 --learnertype SingleStumpLearner --outputinfo resultsSingleStump.dta --shypname shypLazySingleStump.xml --rsample 100

runs LazyBoost with 100 features sampled in each iteration.
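
For intuition, LazyBoost restricts the base learner to a uniformly sampled subset of the features in each boosting iteration. The following Python sketch illustrates the idea only; it is not MultiBoost's implementation:

import random

def lazy_feature_subset(n_features, n_sampled, rng=random):
    # LazyBoost: in each iteration, the base learner searches for the
    # best stump only among a uniform random subset of the features.
    return rng.sample(range(n_features), n_sampled)

# With --rsample 100 on, say, a 5000-feature problem:
subset = lazy_feature_subset(5000, 100)

Since only a fraction of the features is scanned per iteration, each iteration is proportionally cheaper, at the price of occasionally missing the best stump.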

Haar filters

The command 

./multiboost --fileformat arff --traintest uspsHaarTrain.arff uspsHaarTest.arff 100 --verbose 3 --learnertype TreeLearner --baselearnertype HaarSingleStumpLearner 16 --csample num 100 --iisize 15x15 --seed 71 --outputinfo result.dta --shypname shyp.xml

runs AdaBoost.MH with stumps over Haar filters as base learners. The uspsHaarTrain.arff and uspsHaarTest.arff files contain integral images. The --csample argument sets the number of filters sampled in each iteration. The --iisize option specifies the size of the images. The --seed argument initializes the random number generator. For more information about Haar filters, see [2].
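
The integral image at position (i, j) stores the sum of all pixel intensities in the rectangle from the top-left corner to (i, j), so the sum over any rectangle, and hence any Haar filter response, can be computed with a constant number of lookups [2]. A minimal NumPy sketch of the preprocessing one would apply before exporting the images to ARFF (an assumption about the data-preparation pipeline, not code shipped with MultiBoost):

import numpy as np

def integral_image(img):
    # ii[i, j] = sum of img[0..i, 0..j]; with this table, the sum over
    # any axis-aligned rectangle takes only four lookups.
    return img.cumsum(axis=0).cumsum(axis=1)

img = np.arange(15 * 15, dtype=float).reshape(15, 15)  # matches --iisize 15x15
ii = integral_image(img)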

Using bandits to accelerate AdaBoost

Bandit boosting [3, 4] is an alternative to LazyBoost that makes feature sampling more efficient. The command

./multiboost --fileformat arff --traintest pendigitsTrain.arff pendigitsTest.arff 100 --verbose 3 --learnertype BanditSingleStumpLearner --outputinfo resultsBandits.dta --shypname shypBanditSingleStump.xml --banditalgo UCBK --updaterule logedge

boosts decision stumps where the feature to cut on is chosen by the UCB bandit algorithm, with rewards computed by the log-edge update rule.
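
The bandit view treats each feature (or group of features) as an arm whose reward is derived from the edge of the stump built on it; this is what the --updaterule logedge option controls. As a rough illustration of the selection rule behind the UCBK option, here is the classical UCB1 index in Python (see [3] for the exact variant used in MultiBoost):

import math

def ucb1_index(mean_reward, n_pulls, n_total):
    # UCB1: prefer arms with a high average reward, plus an exploration
    # bonus that shrinks as the arm is pulled more often.
    return mean_reward + math.sqrt(2.0 * math.log(n_total) / n_pulls)

# In each iteration, the arm (feature) with the largest index is tried next.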

Sparse data

The command 

./multiboost --fileformat svmlight --traintest train.txt test.txt 1000 --verbose 5 --learnertype SingleSparseStumpLearner --outputinfo resultsSparseStump.dta --constant --shypname shypSparseStump.xml --weightpolicy proportional

runs AdaBoost.MH using decision stumps implemented for sparse features. This base learner is usually coupled with the svmlight input format, which allows the data to be stored sparsely. The command

./multiboost --fileformat svmlight --traintest train.txt test.txt 1000 --verbose 5 --learnertype BanditSingleSparseStump --outputinfo resultsBanditsSparseStump.dta --constant --shypname shypBanditsSparseStump.xml --weightpolicy proportional --banditalgo EXP3P --updaterule logedge --rsample 2

runs the same learner with bandit-based acceleration.
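
For reference, each line of an svmlight-format file encodes one instance as a label followed by sparse index:value pairs; features that do not appear are implicitly zero. A minimal, made-up example (the exact label convention for multi-class data may differ; consult the MultiBoost documentation):

1 3:0.43 12:1.0 4601:0.1
2 1:0.9 27:0.25

Feature indices are 1-based and must appear in increasing order within a line.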

References

[1] G. Escudero, L. Màrquez, and G. Rigau. Boosting applied to word sense disambiguation. In Proceedings of the 11th European Conference on Machine Learning, pages 129–141, 2000.

[2] P. Viola and M. Jones. Robust real-time face detection. International Journal of Computer Vision, 57:137–154, 2004.

[3] R. Busa-Fekete and B. Kégl. Accelerating AdaBoost using UCB. In KDDCup 2009 (JMLR W&CP), volume 7, pages 111–122, Paris, France, 2009.

[4] R. Busa-Fekete and B. Kégl. Fast boosting using adversarial bandits. In International Conference on Machine Learning, volume 27, pages 143–150, 2010.