Bayes Machines for Binary Classification

Daniel Hernández-Lobato ∗ and José Miguel Hernández-Lobato

Escuela Politécnica Superior, Universidad Autónoma de Madrid, C/ Francisco Tomás y Valiente, 11, Madrid 28049, Spain.

∗ Corresponding author. Tel: +34-91-497-2260; fax: +34-91-497-2235.
Email addresses: daniel.hernandez@uam.es (Daniel Hernández-Lobato), josemiguel.hernandez@uam.es (José Miguel Hernández-Lobato).

Abstract

In this work we propose an approach to binary classification based on an extension of Bayes Point Machines. In particular, we take into account the whole set of hypotheses that are consistent with the data (the so-called version space) as well as the intrinsic noise in the class labels. We follow a Bayesian approach and compute an approximate posterior distribution for the model parameters, which leads to a predictive distribution over unseen data. The most compelling feature of the proposed model is that it learns the noise present in the data at no additional cost. All the computations are carried out by means of the approximate Bayesian inference algorithm Expectation Propagation. Experimental results indicate that the proposed approach outperforms Support Vector Machines on several of the classification problems studied and is competitive with other Bayesian classification algorithms based on Gaussian Processes.

Key words: Kernel Methods, Approximate Inference, Bayesian Methods, Expectation Propagation, Bayes Point Machines, Bayes Machines

1 Introduction

Kernel-based classifiers have become quite popular in recent years and, as a result, much research has focused on them. Some popular kernel classifiers are the Support Vector Machine (SVM), the Bayes Point Machine (BPM), and the Gaussian Process Classifier (GPC). The well-known, although not Bayesian, SVM was devised as a classifier that maximizes the margin, that is, the minimum distance between the data points and the class