Vol. 8 (2018) No. 4-2
ISSN: 2088-5334

Ant Colony Optimization Based Subset Feature Selection in Speech Processing: Constructing Graphs with Degree Sequences

R. Rajesvary Rajoo #*, Rosalina Abdul Salam #

# Faculty of Science and Technology, Universiti Sains Islam Malaysia, Bandar Baru Nilai, Negeri Sembilan, Malaysia
E-mail: rajes_e@nilai.edu.my, rosalina@usim.edu.my

* Faculty of Engineering and Technology, Nilai University, No 1, Persiaran Universiti, Putra Nilai, 71800 Nilai, Negeri Sembilan, Malaysia

Abstract— Feature selection, the process of selecting the most discriminating feature subset, is an essential practice in speech processing that significantly affects classification performance. However, the large number of features present in speech processing makes feature selection difficult. Moreover, determining the best feature subset is an NP-hard problem (with $2^n$ candidate subsets). Thus, a good search strategy is required to avoid evaluating a large number of combinations across all feature subsets. As a result, many heuristic-based search algorithms have been developed in recent years to address this NP-hard problem. Ant Colony Optimization (ACO) based algorithms are among the metaheuristic algorithms that have been applied in many application domains to solve the feature selection problem. ACO-based algorithms are inspired by the foraging behavior of real ants. The success of an ACO-based feature selection algorithm, particularly its runtime behavior, depends on the choice of the construction graph. While most ACO-based feature selection algorithms use fully connected graphs, this paper proposes an ACO-based algorithm that uses graphs with prescribed degree sequences. In this method, the degree of the graph representing the search space is predicted, and a construction graph that satisfies the predicted degree is generated.
This research direction on graph representation for ACO algorithms may offer possibilities to reduce computational complexity from $O(n^2)$ to $O(nm)$, where $m$ is the number of edges. This paper outlines some popular optimization-based feature selection algorithms in the field of speech processing and gives an overview of the ACO algorithm and its main variants. In addition, ACO-based feature selection is explained and its application in various speech processing tasks is reviewed. Finally, a degree-based graph construction for ACO algorithms is proposed.

Keywords— feature selection; speech processing; heuristic algorithms; ant colony optimization; degree sequences.

I. INTRODUCTION

Most speech processing tasks employ the machine learning paradigm, in which a classifier must undergo a proper learning process. Classification performance in machine learning is strongly associated with salient features. Therefore, selecting salient features from the feature vectors, which consist of a large set of feature values, is crucial. Extracting salient features from the given set of features reduces the dimensionality of the data set and consequently improves the accuracy and runtime performance of the classifiers [1].

The feature selection process aims to generate a reduced set of the most discriminative features from the existing feature set by eliminating redundant and irrelevant features. It comprises two main parts: a search strategy that explores the search space and selects subsets of features, and a measurement procedure that evaluates the quality of these subsets so that the best subset can be selected [2].

Determining the most appropriate feature subset is an NP-hard problem with $2^n$ candidate subsets, where $n$ denotes the number of features. Thus, a good search strategy is required to avoid evaluating a large number of combinations across all feature subsets.
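
To make the two-part structure and the $2^n$ growth concrete, the following is a minimal Python sketch, not the method proposed in this paper. It pairs a toy evaluation measure with two search strategies: exhaustive enumeration and a simple greedy forward search. The feature names, the `RELEVANCE` weights, the `SIZE_PENALTY` value, and the `score` function are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical relevance weights for six made-up features (not from the paper).
RELEVANCE = {"f0": 0.9, "f1": 0.8, "f2": 0.1, "f3": 0.05, "f4": 0.15, "f5": 0.0}
SIZE_PENALTY = 0.2  # assumed cost per selected feature


def score(subset):
    """Toy evaluation measure: summed relevance minus a per-feature penalty."""
    return sum(RELEVANCE[f] for f in subset) - SIZE_PENALTY * len(subset)


def exhaustive_search(features):
    """Score every non-empty subset; returns (best subset, evaluations)."""
    best, best_score, evaluations = frozenset(), float("-inf"), 0
    for size in range(1, len(features) + 1):
        for combo in combinations(features, size):
            evaluations += 1
            value = score(combo)
            if value > best_score:
                best, best_score = frozenset(combo), value
    return best, evaluations


def sequential_forward_selection(features):
    """Greedily add the single best feature until the score stops improving."""
    selected, current, evaluations = frozenset(), float("-inf"), 0
    remaining = sorted(features)
    while remaining:
        scored = []
        for f in remaining:
            evaluations += 1
            scored.append((score(selected | {f}), f))
        value, feature = max(scored)
        if value <= current:
            break  # no single addition improves the subset
        selected, current = selected | {feature}, value
        remaining.remove(feature)
    return selected, evaluations


features = list(RELEVANCE)
best_exhaustive, n_exhaustive = exhaustive_search(features)
best_sfs, n_sfs = sequential_forward_selection(features)
print(sorted(best_exhaustive), n_exhaustive)
print(sorted(best_sfs), n_sfs)
```

With six features, the exhaustive search scores all $2^6 - 1$ non-empty subsets, while the greedy search needs far fewer evaluations. The two agree here only because the toy score is additive; with interacting features a greedy search can stop at a subset that is not the best one.
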
As a result, many search strategies have been proposed in the literature, such as Sequential Backward Selection, Sequential Forward Selection, Bidirectional Selection and Complete Search. These search processes fall into two main approaches: filter and wrapper methods. Filter-based approaches evaluate features or subsets of features independently of the classifier. Wrapper approaches, in contrast, use a classifier to evaluate the subsets of features. Some studies use embedded methods to take advantage of both approaches [3, 4, 5].

Most of the search techniques mentioned above use local search instead of global search throughout the entire process, and it is therefore difficult for them to reach optimal or near-optimal solutions. Hence, in recent years there has been a strong drive from the computational intelligence community for