Annals of Mathematics and Artificial Intelligence 41: 95–109, 2004.
© 2004 Kluwer Academic Publishers. Printed in the Netherlands.

Improvement of boosting algorithm by modifying the weighting rule

Masayuki Nakamura, Hiroki Nomiya and Kuniaki Uehara
Graduate School of Science and Technology, Kobe University, Nada, Kobe 657-8501, Japan
E-mail: m-yuki@hcc1.bai.ne.jp; nomiya@ai.cs.scitec.kobe-u.ac.jp; uehara@kobe-u.ac.jp

AdaBoost is a method for improving the classification accuracy of a given learning algorithm by combining hypotheses created by that learning algorithm. One of the drawbacks of AdaBoost is that its performance deteriorates when the training examples include noisy or exceptional examples, which are called hard examples. This happens because AdaBoost assigns excessively high weights to hard examples. In this research, we introduce thresholds into the weighting rule of AdaBoost in order to prevent weights from growing too large. During the learning process, we compare the upper bound of the classification error of our method with that of AdaBoost, and we set the thresholds so that the upper bound of our method is superior to that of AdaBoost. Our method shows better performance than AdaBoost.

Keywords: AdaBoost, hard example, weighting rule, error bound, NadaBoost

AMS subject classification: 68T05, 68Q32

1. Introduction

Boosting is a method for improving the classification accuracy of a given learning algorithm by combining hypotheses created by the learning algorithm. Recently, many researchers have carried out research on AdaBoost [3,7,9], the typical boosting algorithm. They have reported that AdaBoost shows very good performance on many classification problems. However, some drawbacks of AdaBoost have also been reported. One of these is its susceptibility to noisy examples: AdaBoost tends to concentrate on noisy examples, which worsens its performance.
This phenomenon is called overfitting. AdaBoost usually calls a given learning algorithm repeatedly. We term this learning algorithm WeakLearn. AdaBoost maintains a set of weights over the training examples, which is used as a distribution. Initially, all weights are set equal, but on each round the weights of incorrectly classified examples are increased so that WeakLearn is forced to concentrate on the examples that are difficult to classify correctly. That is, as the weight of a certain example becomes higher, the hypothesis becomes more specific to that example. However, if the training set includes noisy examples, AdaBoost tends to assign higher weights to noisy examples.
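The weighting rule described above is the standard AdaBoost update of Freund and Schapire: after each round, every weight is multiplied by a factor that grows it when the weak hypothesis misclassified the example and shrinks it otherwise, then the distribution is renormalized. A minimal sketch of one such round follows; the toy labels and predictions are illustrative assumptions, not data from the paper.

```python
import numpy as np

def adaboost_update(weights, y_true, y_pred):
    """One round of the standard AdaBoost weighting rule.

    weights : current distribution over training examples (sums to 1)
    y_true  : true labels in {-1, +1}
    y_pred  : weak hypothesis predictions in {-1, +1}
    """
    eps = np.sum(weights[y_true != y_pred])      # weighted training error
    alpha = 0.5 * np.log((1.0 - eps) / eps)      # weight of this hypothesis
    # Misclassified examples (y_true * y_pred = -1) are scaled up by
    # exp(alpha); correctly classified ones are scaled down by exp(-alpha).
    new_w = weights * np.exp(-alpha * y_true * y_pred)
    return new_w / new_w.sum(), alpha            # renormalize to a distribution

# Illustrative round: five examples, uniform initial weights, one mistake.
y_true = np.array([1, 1, -1, -1, 1])
y_pred = np.array([1, -1, -1, -1, 1])            # example at index 1 is misclassified
w0 = np.full(5, 0.2)
w1, alpha = adaboost_update(w0, y_true, y_pred)
print(w1)  # the misclassified example now carries weight 0.5
```

Note how a single misclassified example ends up holding half of the total weight after one round; iterated over many rounds on a noisy example that WeakLearn keeps getting wrong, this multiplicative growth is exactly what drives the overfitting behavior the paper sets out to curb.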