Boosting Bayesian MAP Classification

Paolo Piro (CNRS/University of Nice-Sophia Antipolis, piro@i3s.unice.fr)
Richard Nock (CEREGMIA, University of Antilles-Guyane, rnock@martinique.univ-ag.fr)
Frank Nielsen (Ecole Polytechnique, France / Sony CSL, nielsen@lix.polytechnique.fr)
Michel Barlaud (CNRS/University of Nice-Sophia Antipolis, barlaud@i3s.unice.fr)

Abstract

In this paper we redefine and generalize the classic k-nearest neighbors (k-NN) voting rule in a Bayesian maximum-a-posteriori (MAP) framework. To this end, annotated examples are used to estimate pointwise class probabilities in the feature space, giving rise to a new instance-based classification rule. Namely, we propose to "boost" the classic k-NN rule by inducing a strong classifier from a combination of sparse training data, called "prototypes". In order to learn these prototypes, our MAPBOOST algorithm globally minimizes a multiclass exponential risk defined over the training data, which depends on the class probabilities estimated at the sample points themselves. We tested our method for image categorization on three benchmark databases. Experimental results show that MAPBOOST significantly outperforms classic k-NN (by up to 8%). Interestingly, owing to the supervised selection of sparse prototypes and the multiclass classification framework, the accuracy improvement comes with a considerable reduction in computational cost.

1. Introduction

We address the task of image categorization, which aims at classifying images into a predefined set of categories. Several techniques have been proposed to solve this problem automatically, among which instance-based methods, such as k-nearest neighbors (k-NN) classification, have shown very good performance [1]. In particular, much research effort has been devoted to improving the statistical and computational properties of the classic k-NN vote, which relies on labeled neighbors to predict the class of unlabeled data [11].
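For reference, the classic k-NN vote that the paper generalizes can be sketched as follows; this is a minimal illustration using Euclidean distance and majority voting (the distance and tie-breaking choices are assumptions, not specified by the paper):

```python
import numpy as np

def knn_vote(X_train, y_train, x, k=5):
    """Classic k-NN voting rule: predict the majority class among the
    k nearest training examples, using Euclidean distance."""
    # Distances from the query x to every training example
    d = np.linalg.norm(X_train - x, axis=1)
    # Indices of the k nearest neighbors
    nearest = np.argsort(d)[:k]
    # Majority vote over the neighbors' labels
    classes, counts = np.unique(y_train[nearest], return_counts=True)
    return classes[np.argmax(counts)]
```

MAPBOOST replaces this uniform vote by a learned, leveraged combination over a sparse set of prototypes, as described in Sec. 2.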
Such methods can be viewed as primers to improve the (continuous) estimation of the class membership probabilities. Moreover, a Bayesian reassessment of the problem has recently been proposed as a motivation for the formal transposition of boosting to k-NN classification [5].

We generalize the k-NN rule in a supervised Bayesian framework, where annotated data (sample points) are used for non-parametric maximum-a-posteriori (MAP) estimation [2]. Namely, our main contribution is to redefine the classic voting rule as a strong classifier that linearly combines predictions from sample points in a boosting framework. For this purpose, our boosting algorithm minimizes a multiclass risk function over the training data, thus redefining the UNN approach of [9] directly in a multiclass framework.

In the following sections, we first define the boosting problem for MAP classifiers and describe our leveraging algorithm (Sec. 2.1–2.2). Then, we provide the solution when using kernel density estimators (Sec. 2.4), thus highlighting the link to classic k-NN classification. Finally, we present and discuss some experimental results on the categorization of natural images (Sec. 3).

2. Method

2.1 (Leveraged) MAP classification

We tackle the classification problem directly in a multiclass framework, i.e., unlike [9], we do not reduce it to multiple two-class problems. We suppose we are given a training set S of m annotated examples (x_i, y_i), where x_i is the image descriptor and y_i the class vector that specifies the category membership. In particular, the sign of component y_{ic} gives the positive/negative membership of the example to class c (c = 1, ..., C).
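The class-vector encoding and the leveraged linear combination can be sketched as follows. This is only an illustration under stated assumptions: the symmetric encoding y_{ic} = -1/(C-1) for non-member classes is one common choice in multiclass boosting analyses (the paper only specifies the sign convention), and the Gaussian kernel and the names `alpha`, `leveraged_score` are hypothetical, not the paper's actual estimator:

```python
import numpy as np

def class_vectors(labels, C):
    """Encode integer labels as class vectors y_i: y_ic is +1 for the
    example's own class, negative otherwise (symmetric encoding assumed)."""
    Y = -np.ones((len(labels), C)) / (C - 1)
    Y[np.arange(len(labels)), labels] = 1.0
    return Y

def leveraged_score(x, prototypes, Y, alpha, sigma=1.0):
    """Strong classifier H_c(x): a linear combination of prototype
    predictions, weighted by leveraging coefficients alpha_j and a
    Gaussian kernel (kernel choice is an assumption of this sketch).
    Returns a length-C score vector; predict the argmax class."""
    d2 = np.sum((prototypes - x) ** 2, axis=1)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    return (alpha * k) @ Y
```

In MAPBOOST, the coefficients alpha_j would be learned by minimizing the multiclass exponential risk over the training data; here they are simply taken as given.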
Inspired by the multiclass boosting analysis of [12], we constrain

2010 International Conference on Pattern Recognition 1051-4651/10 $26.00 © 2010 IEEE DOI 10.1109/ICPR.2010.167 665