Semantic Image Classification with Hierarchical Feature Subset Selection Yuli Gao Dept of Computer Science UNC-Charlotte Charlotte, NC 28223, USA ygao@uncc.edu Jianping Fan Dept of Computer Science UNC-Charlotte Charlotte, NC 28223, USA jfan@uncc.edu ABSTRACT High-dimensional visual features for image content charac- terization enables effective image classification. However, training accurate image classifiers in high-dimensional fea- ture space suffers from the problem of curse of dimensional- ity and thus requires a large number of labeled images. To achieve accurate classifier training in high-dimensional fea- ture space, we propose a hierarchical feature subset selection algorithm for semantic image classification, where the fea- ture subset selection procedure is seamlessly integrated with the underlying classifier training procedure in a single algo- rithm. First, our hierarchical feature subset selection frame- work partitions the high-dimensional feature space into mul- tiple homogeneous feature subspaces and forms a two-level feature hierarchy. Second, weak image classifiers are trained for each homogeneous feature subspace at the lower level of the feature hierarchy, where the traditional feature subset selection techniques such as principal component analysis (PCA) can be used for dimension reduction. Finally, these weak classifiers are boosted to determine an optimal image classifier and the higher-level feature subset selection is real- ized by selecting the most effective weak classifiers and their corresponding homogeneous feature subsets. Our experi- ments on a specific domain of natural images have obtained very positive results. Categories and Subject Descriptors I.4.7 [Image Processing and Computer Vision]: Fea- ture Measurement— feature representation ; I.5 [Artificial Intelligence]: Learning— concept learning General Terms Algorithms, Experimentation Keywords Feature selection, classifier training, semantic image classi- fication Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. MIR’05, November 10–11, 2005, Singapore. Copyright 2005 ACM 1-59593-244-5/05/0011 ...$5.00. 1. INTRODUCTION As high-resolution digital cameras become more afford- able and widespread, personal collections of digital images are growing exponentially. Thus, semantic image classifica- tion becomes increasingly important and necessary to sup- port automatic image annotation and semantic image re- trieval via keywords [22, 20, 25, 10, 34, 1]. However, the per- formance of image classifiers largely depends on two inter- related issues: (1) The quality of features [19] that are used for image content representation; (2) The effectiveness of classifier training algorithm. Many image classification systems have been proposed in the literatures [27, 32, 2, 29, 26, 19], and they can be gen- erally classified into two categories based on the underlying framework for image content representation [4, 35, 28, 18]: (a) The first category segments the image into some mean- ingful components and uses them as semantic elements to characterize image content [3, 21]. For example, Carson et al. proposed a blob-based image representation which cal- culates image similarities based on the visual similarities of the image blobs [8]. (b) The second category takes an image as a whole visual appearance and characterize image con- tents by using image-based global visual features [11, 23]. A well-known example is the system developed by Torralba and Oliva which uses discriminant structural templates to represent the global visual properties of natural scene im- ages [31]. Despite the different natures of the underlying image con- tent representation framework, most existing semantic im- age classification techniques rely on high-dimensional visual features. Ideally, using more visual features can enhance the classifier’s ability in identifying different semantic im- age concepts and thus result in higher classification accu- racy. However, learning the image classifier in such high- dimensional feature space requires a large number of labeled samples that generally increases exponentially as the fea- ture dimension increases [17]. Thus, automatically selecting the low-dimensional feature subset with high discrimination power and training the image classifier in such relatively low-dimensional feature subset are one promising solution to address the problem of the curse of dimensionality. To select the optimal feature subset for classifier training, many algorithms have been proposed which can be gener- ally classified into two categories: filter and wrapper.A filter algorithm separates the procedures for feature subset selec- tion and classifier training by merely calculating the ranking information for each feature dimension based on its corre- lation score with the prediction variable. A wrapper algo- 135