Feature Space Reduction for Graph-Based Image Classification

Niusvel Acosta-Mendoza 1,2,⋆, Andrés Gago-Alonso 1, Jesús Ariel Carrasco-Ochoa 2, José Francisco Martínez-Trinidad 2, and José E. Medina-Pagola 1

1 Advanced Technologies Application Center (CENATAV), 7a ♯ 21406 e/ 214 and 216, Siboney, Playa, CP: 12200, Havana, Cuba
{nacosta,agago,jmedina}@cenatav.co.cu
2 National Institute of Astrophysics, Optics and Electronics (INAOE), Luis Enrique Erro No. 1, Sta. María Tonantzintla, Puebla, CP: 72840, Mexico
{ariel,fmartine}@inaoep.mx

Abstract. Feature selection is an essential preprocessing step for classifiers with high dimensional training sets. In pattern recognition, feature selection improves the performance of classification by reducing the feature space while preserving the classification capabilities of the original feature space. Image classification using frequent approximate subgraph mining (FASM) is an example where the benefits of feature selection are needed, because using frequent approximate subgraphs (FASs) leads to high dimensional representations. In this paper, we explore the use of feature selection algorithms in order to reduce the representation of an image collection represented through FASs. In our results we report a dimensionality reduction of over 50% of the original features, obtaining classification results similar to those reported by using all the features.

Keywords: Approximate graph mining, approximate graph matching, feature selection, graph-based classification.

1 Introduction

Finding a discriminative subset of features is essential when there are high dimensional representations. Feature selection algorithms improve classifier performance by reducing the feature space while keeping the discrimination capabilities of the original representation.
The main idea of these algorithms is to calculate a subset of the input features by removing those with little or no predictive information for classification [3–6, 10, 15, 17, 19]. These algorithms can be arranged into three main groups: wrapper algorithms [3], filter algorithms [6, 10, 19], and embedded algorithms [5, 17]. Wrapper algorithms use a classifier to evaluate feature subsets. The advantage of these algorithms is the interaction between the feature subset search and the classifier, but it is an expensive

⋆ Corresponding author.

J. Ruiz-Shulcloper and G. Sanniti di Baja (Eds.): CIARP 2013, Part I, LNCS 8258, pp. 246–253, 2013. © Springer-Verlag Berlin Heidelberg 2013
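As an illustration of the filter approach described above, the following is a minimal sketch (not the paper's method) of a filter-style selector over binary features, such as presence/absence vectors of frequent approximate subgraphs. Each feature is scored independently of any classifier by how differently it occurs across the two classes, and only the top-k features are kept; the function name, scoring criterion, and toy data are all illustrative assumptions.

```python
def filter_select(X, y, k):
    """Return the indices of the k binary features whose class-conditional
    occurrence rates differ the most (a simple filter criterion: features
    scored independently, no classifier in the loop)."""
    n_features = len(X[0])
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(n_features):
        p1 = sum(row[j] for row in pos) / len(pos)  # occurrence rate in class 1
        p0 = sum(row[j] for row in neg) / len(neg)  # occurrence rate in class 0
        scores.append((abs(p1 - p0), j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:k])

# Toy data: 4 images x 5 hypothetical FAS features. Features 0 and 3 separate
# the classes perfectly; feature 4 is constant (no predictive information).
X = [[1, 0, 1, 0, 1],
     [1, 1, 0, 0, 1],
     [0, 0, 1, 1, 1],
     [0, 1, 0, 1, 1]]
y = [1, 1, 0, 0]

print(filter_select(X, y, k=2))  # -> [0, 3]; the uninformative features are dropped
```

Because the score ignores the downstream classifier, filter selection is much cheaper than the wrapper approach, at the cost of missing feature interactions that only a classifier-in-the-loop search would detect.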