Bootstrap Feature Selection in Support Vector Machines for Ventricular Fibrillation Detection

F. Alonso Atienza 1, J.L. Rojo Álvarez 1, G. Camps i Valls 2, A. Rosado Muñoz 2, and A. García Alberola 3 *

1- University Carlos III of Madrid - Dept. of Signal Theory and Communications
Av. Universidad 30, 28911, Leganés, Madrid - Spain
2- University of Valencia - Dept. of Electrical Engineering (GPDS)
Doctor Moliner 50, 46100, Burjassot, Valencia - Spain
3- University Hospital Virgen de la Arrixaca - Lab. of Electrophysiology
Ct. Madrid-Cartagena s/n, 30120, El Palmar, Murcia - Spain

Abstract. Support Vector Machines (SVM) for classification have received special attention in a number of practical applications. When nonlinear Mercer kernels are used, the mapping of the input space to a high-dimensional feature space makes input feature selection a difficult task. In this paper, we propose the use of nonparametric bootstrap resampling to provide a statistical, distribution-independent criterion for input space feature selection. The confidence interval of the difference in error probability between the complete input space and an input space reduced by one variable is estimated via bootstrap resampling. A backward variable elimination procedure can then be stated, in which one variable is removed at each step according to its associated confidence interval. A practical application to early-stage detection of cardiac Ventricular Fibrillation (VF) is presented. Building on a previous nonlinear analysis of temporal and spectral VF parameters, we use the SVM with Gaussian kernel and bootstrap resampling to obtain the minimum input feature set that still retains the classification performance of the complete data. The use of bootstrap resampling is a powerful input feature selection procedure for SVM classifiers.
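The criterion outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the function names (`nearest_centroid_error`, `bootstrap_error_difference_ci`) are hypothetical, a simple nearest-centroid rule stands in for the Gaussian-kernel SVM so that the example needs only NumPy, the error is measured on the samples left out of each bootstrap resample, and the interval is a plain percentile interval. Any classifier with the same call signature, such as an SVM, could be passed as `error_fn`.

```python
import numpy as np


def nearest_centroid_error(X_train, y_train, X_test, y_test):
    """Stand-in classifier (NOT the paper's SVM): test error of a nearest-centroid rule."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every test point to every class centroid.
    dists = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    y_pred = classes[dists.argmin(axis=1)]
    return float(np.mean(y_pred != y_test))


def bootstrap_error_difference_ci(X, y, drop, error_fn, n_boot=200, alpha=0.05, seed=0):
    """Percentile CI of (error without feature `drop`) - (error with all features).

    Assumed protocol: train on each bootstrap resample and test on the
    samples that the resample left out.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    keep = [j for j in range(X.shape[1]) if j != drop]
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)          # sample n indices with replacement
        oob = np.setdiff1d(np.arange(n), idx)     # indices never drawn in this resample
        if oob.size == 0 or np.unique(y[idx]).size < 2:
            continue                              # skip degenerate resamples
        e_full = error_fn(X[idx], y[idx], X[oob], y[oob])
        e_reduced = error_fn(X[idx][:, keep], y[idx], X[oob][:, keep], y[oob])
        diffs.append(e_reduced - e_full)
    if not diffs:
        raise ValueError("no usable bootstrap resamples")
    lo, hi = np.percentile(diffs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lo), float(hi)


# Synthetic illustration: feature 0 carries the class information, feature 1 is noise.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=200)
X = np.column_stack([3.0 * y + rng.normal(size=200), rng.normal(size=200)])
ci_informative = bootstrap_error_difference_ci(X, y, drop=0, error_fn=nearest_centroid_error)
ci_noise = bootstrap_error_difference_ci(X, y, drop=1, error_fn=nearest_centroid_error)
print("CI when dropping feature 0:", ci_informative)
print("CI when dropping feature 1:", ci_noise)
```

Under this reading, a variable whose interval lies clearly above zero degrades the classifier when removed and would be kept, whereas a variable whose interval includes zero would be a candidate for removal in the next backward elimination step. This decision rule is an assumption of the sketch.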
1 Introduction

Support Vector Machines (SVM) are efficient learning schemes [1], which have received special attention in recent years. The SVM classification algorithm has shown excellent performance in a number of practical applications [2], in terms of minimal classification error probability. In particular, SVM are robust when working with high-dimensional input spaces, such as images or gene expressions [3]. In some applications, the best classification is not the only requirement: the relative relevance of each input space feature must also be quantified, and the smallest set of variables carrying non-redundant information must be determined. In classical statistics, this twofold task is addressed by linear discriminant analysis and its nonlinear versions [4].

* This work has been partially supported by research projects from Guidant Spain and from "Comunidad de Madrid" (GR/SAL/0471/2004).

ESANN'2006 proceedings - European Symposium on Artificial Neural Networks, Bruges (Belgium), 26-28 April 2006, d-side publi., ISBN 2-930307-06-4.