Boosting Bayesian MAP Classification

Paolo Piro (CNRS/University of Nice-Sophia Antipolis, piro@i3s.unice.fr)
Richard Nock (CEREGMIA, University of Antilles-Guyane, rnock@martinique.univ-ag.fr)
Frank Nielsen (Ecole Polytechnique, France / Sony CSL, nielsen@lix.polytechnique.fr)
Michel Barlaud (CNRS/University of Nice-Sophia Antipolis, barlaud@i3s.unice.fr)

Abstract

In this paper we redefine and generalize the classic k-nearest neighbors (k-NN) voting rule in a Bayesian maximum-a-posteriori (MAP) framework. To this end, annotated examples are used to estimate pointwise class probabilities in the feature space, giving rise to a new instance-based classification rule. Namely, we propose to "boost" the classic k-NN rule by inducing a strong classifier from a combination of sparse training data, called "prototypes". In order to learn these prototypes, our MAPBOOST algorithm globally minimizes a multiclass exponential risk defined over the training data, which depends on the class probabilities estimated at the sample points themselves. We tested our method for image categorization on three benchmark databases. Experimental results show that MAPBOOST significantly outperforms classic k-NN (by up to 8%). Interestingly, owing to the supervised selection of sparse prototypes and the multiclass classification framework, the accuracy improvement comes with a considerable reduction in computational cost.

1. Introduction

We address the task of image categorization, which aims at classifying images into a predefined set of categories. Several techniques have been proposed to solve this problem automatically, among which instance-based methods, such as k-nearest neighbors (k-NN) classification, have shown very good performance [1]. In particular, much research effort has been devoted to improving the statistical and computational properties of the classic k-NN vote, which relies on labeled neighbors to predict the class of unlabeled data [11].
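For reference, the classic k-NN vote that the paper generalizes can be sketched as follows; this is a minimal illustration using Euclidean distance and majority voting (the distance and tie-breaking choices are assumptions, not specified by the paper):

```python
import numpy as np

def knn_vote(X_train, y_train, x, k=5):
    """Classic k-NN voting rule: predict the majority class among the
    k nearest training examples, using Euclidean distance."""
    # Distances from the query x to every training example
    d = np.linalg.norm(X_train - x, axis=1)
    # Indices of the k nearest neighbors
    nearest = np.argsort(d)[:k]
    # Majority vote over the neighbors' labels
    classes, counts = np.unique(y_train[nearest], return_counts=True)
    return classes[np.argmax(counts)]
```

MAPBOOST replaces this uniform vote by a learned, leveraged combination over a sparse set of prototypes, as described in Sec. 2.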
Such methods can be viewed as primers to improve the (continuous) estimation of the class membership probabilities. Moreover, a Bayesian reassessment of the problem has recently been proposed as a motivation for the formal transposition of boosting to k-NN classification [5].

We generalize the k-NN rule in a supervised Bayesian framework, where annotated data (sample points) are used for non-parametric maximum-a-posteriori (MAP) estimation [2]. Namely, our main contribution is to redefine the classic voting rule as a strong classifier that linearly combines predictions from sample points in a boosting framework. For this purpose, our boosting algorithm minimizes a multiclass risk function over the training data, thus redefining the UNN approach of [9] directly in a multiclass framework.

In the following sections, we first define the boosting problem for MAP classifiers and describe our leveraging algorithm (Sec. 2.1–2.2). Then, we provide the solution when using kernel density estimators (Sec. 2.4), thus highlighting the link to classic k-NN classification. Finally, we present and discuss some experimental results on the categorization of natural images (Sec. 3).

2. Method

2.1 (Leveraged) MAP classification

We tackle the classification problem directly in a multiclass framework, i.e., unlike [9], we do not reduce it to multiple two-class problems. We suppose we are given a training set S of m annotated examples (x_i, y_i), where x_i is the image descriptor and y_i the class vector that specifies the category membership. In particular, the sign of component y_{ic} gives the positive/negative membership of the example to class c (c = 1, ..., C).
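The class-vector encoding and the leveraged linear combination can be sketched as follows. This is only an illustration under stated assumptions: the symmetric encoding y_{ic} = -1/(C-1) for non-member classes is one common choice in multiclass boosting analyses (the paper only specifies the sign convention), and the Gaussian kernel and the names `alpha`, `leveraged_score` are hypothetical, not the paper's actual estimator:

```python
import numpy as np

def class_vectors(labels, C):
    """Encode integer labels as class vectors y_i: y_ic is +1 for the
    example's own class, negative otherwise (symmetric encoding assumed)."""
    Y = -np.ones((len(labels), C)) / (C - 1)
    Y[np.arange(len(labels)), labels] = 1.0
    return Y

def leveraged_score(x, prototypes, Y, alpha, sigma=1.0):
    """Strong classifier H_c(x): a linear combination of prototype
    predictions, weighted by leveraging coefficients alpha_j and a
    Gaussian kernel (kernel choice is an assumption of this sketch).
    Returns a length-C score vector; predict the argmax class."""
    d2 = np.sum((prototypes - x) ** 2, axis=1)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    return (alpha * k) @ Y
```

In MAPBOOST, the coefficients alpha_j would be learned by minimizing the multiclass exponential risk over the training data; here they are simply taken as given.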
Inspired by the multiclass boosting analysis of [12], we constrain

2010 International Conference on Pattern Recognition 1051-4651/10 $26.00 © 2010 IEEE DOI 10.1109/ICPR.2010.167 665