Feature Space Reduction for Graph-Based Image Classification

Niusvel Acosta-Mendoza 1,2,⋆, Andrés Gago-Alonso 1, Jesús Ariel Carrasco-Ochoa 2, José Francisco Martínez-Trinidad 2, and José E. Medina-Pagola 1

1 Advanced Technologies Application Center (CENATAV), 7a ♯ 21406 e/ 214 and 216, Siboney, Playa, CP: 12200, Havana, Cuba
{nacosta,agago,jmedina}@cenatav.co.cu
2 National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, Sta. María Tonantzintla, Puebla, CP: 72840, Mexico
{ariel,fmartine}@inaoep.mx

Abstract. Feature selection is an essential preprocessing step for classifiers with high dimensional training sets. In pattern recognition, feature selection improves the performance of classification by reducing the feature space while preserving the classification capabilities of the original feature space. Image classification using frequent approximate subgraph mining (FASM) is an example where the benefits of feature selection are needed, because using frequent approximate subgraphs (FASs) leads to high dimensional representations. In this paper, we explore the use of feature selection algorithms in order to reduce the representation of an image collection represented through FASs. In our results we report a dimensionality reduction of over 50% of the original features, obtaining classification results similar to those reported by using all the features.

Keywords: Approximate graph mining, approximate graph matching, feature selection, graph-based classification.

1 Introduction

Finding a discriminative subset of features is essential when there are high dimensional representations. Feature selection algorithms improve classifier performance by reducing the feature space while keeping the discrimination capabilities of the original representation.
The main idea of these algorithms is to calculate a subset of the input features by removing those with little or no predictive information for classification [3–6, 10, 15, 17, 19]. These algorithms can be arranged into three main groups: wrapper algorithms [3], filter algorithms [6, 10, 19], and embedded algorithms [5, 17]. Wrapper algorithms use a classifier to evaluate feature subsets. The advantage of these algorithms is the interaction between the feature subset search and the classifier, but it is an expensive

⋆ Corresponding author.

J. Ruiz-Shulcloper and G. Sanniti di Baja (Eds.): CIARP 2013, Part I, LNCS 8258, pp. 246–253, 2013. © Springer-Verlag Berlin Heidelberg 2013
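As an illustration of the filter approach described above, the following is a minimal sketch (not the paper's method) of a filter-style selector over binary features, such as presence/absence vectors of frequent approximate subgraphs. Each feature is scored independently of any classifier by how differently it occurs across the two classes, and only the top-k features are kept; the function name, scoring criterion, and toy data are all illustrative assumptions.

```python
def filter_select(X, y, k):
    """Return the indices of the k binary features whose class-conditional
    occurrence rates differ the most (a simple filter criterion: features
    scored independently, no classifier in the loop)."""
    n_features = len(X[0])
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(n_features):
        p1 = sum(row[j] for row in pos) / len(pos)  # occurrence rate in class 1
        p0 = sum(row[j] for row in neg) / len(neg)  # occurrence rate in class 0
        scores.append((abs(p1 - p0), j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:k])

# Toy data: 4 images x 5 hypothetical FAS features. Features 0 and 3 separate
# the classes perfectly; feature 4 is constant (no predictive information).
X = [[1, 0, 1, 0, 1],
     [1, 1, 0, 0, 1],
     [0, 0, 1, 1, 1],
     [0, 1, 0, 1, 1]]
y = [1, 1, 0, 0]

print(filter_select(X, y, k=2))  # -> [0, 3]; the uninformative features are dropped
```

Because the score ignores the downstream classifier, filter selection is much cheaper than the wrapper approach, at the cost of missing feature interactions that only a classifier-in-the-loop search would detect.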