1st Conference on Swarm Intelligence and Evolutionary Computation (CSIEC2016), Higher Education Complex of Bam, Iran, 2016. 978-1-4673-8737-8/16/$31.00 ©2016 IEEE

A hybrid method for dimensionality reduction in microarray data based on advanced binary ant colony algorithm

Amirreza Rouhi, Hossein Nezamabadi-pour
Department of Electrical Engineering, Shahid Bahonar University of Kerman, Kerman, Iran
amirreza.rouhi1@gmail.com, nezam@mail.uk.ac.ir

Abstract— The advent and proliferation of high-dimensional data have drawn the attention of researchers toward feature selection in machine learning and data mining. The growing number of irrelevant and redundant features decreases the accuracy of classifiers, increases their computational cost, and aggravates the "curse of dimensionality". This paper proposes a hybrid method in which a number of filter methods first reduce the dimensionality of the feature space, and the advanced binary ant colony (ABACOH) meta-heuristic algorithm then runs on the reduced feature set to select the most effective feature subset. The performance of the proposed method is measured by applying it to five well-known high-dimensional microarray datasets, and the results are compared with those of several state-of-the-art methods. The obtained results confirm the effectiveness of the proposed algorithm.

Keywords: feature selection, high-dimensional data, hybrid methods, meta-heuristic methods, filter methods, ensemble methods

I. INTRODUCTION

Feature selection is one of the fundamental concepts of machine learning and plays a particularly important role in classification, where irrelevant and redundant features can undermine the efficiency, effectiveness, and speed of classifiers. The advent of high-dimensional data, such as microarray datasets containing hundreds or thousands of features, has made feature selection far more difficult.
Processing the entirety of the features to separate and classify the data can become very costly and time-consuming, and this is where dimensionality reduction can provide viable strategies. A dimensionality reduction after which the selected features retain the desired accuracy can prove highly valuable, because it eliminates the irrelevant and redundant features, thus enhancing both the speed of the training phase and the rate of correct classification.

Feature selection methods introduced to date can generally be classified into four categories: 1) filter methods, 2) wrapper methods, 3) hybrid methods, and 4) embedded methods. Filter methods act independently of the learning algorithm. They use the inherent characteristics of the data to rank the features and then select the highest-ranking ones. These methods are relatively fast, so they can be effectively applied to high-dimensional data. Filter methods can be divided into two categories: univariate and multivariate. Univariate methods use a measure to evaluate each feature individually and ignore possible associations between features, which sometimes renders them inadequate. Multivariate methods consider the dependencies between features, but they are computationally more expensive than univariate methods. The most widely known univariate filter methods include Information Gain (IG) [1, 2], Fisher score (F-score) [3], Term variance (TV) [4], and Laplacian score (LS) [5]. Multivariate filter methods include, but are not limited to, Correlation-based feature selection (CFS) [6], Interact [7], Relevance-redundancy feature selection (RRFS) [8], Random subspace method (RSM) [9], Mutual correlation (MC) [10], and Fast correlation-based filter (FCBF). Wrapper methods use the classification error rate as a metric for measuring the quality of feature subsets; as a result, they can provide highly accurate results.
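As a concrete illustration of the univariate filter ranking discussed above, the following is a minimal sketch of F-score-style ranking: each feature is scored by its between-class scatter divided by its within-class scatter, independently of all other features. The toy dataset and the choice of `k` are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of a univariate filter: per-feature Fisher-score ranking.
import numpy as np

def fisher_score(X, y):
    """Per-feature score: between-class scatter / within-class scatter."""
    overall_mean = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        n_c = Xc.shape[0]
        num += n_c * (Xc.mean(axis=0) - overall_mean) ** 2
        den += n_c * Xc.var(axis=0)
    return num / (den + 1e-12)  # small constant avoids division by zero

def select_top_k(X, y, k):
    """Keep the indices of the k highest-ranking features."""
    return np.argsort(fisher_score(X, y))[::-1][:k]

# Toy example: feature 0 separates the two classes, feature 1 is noise.
X = np.array([[0.0, 5.0], [0.1, 1.0], [5.0, 4.9], [5.1, 1.1]])
y = np.array([0, 0, 1, 1])
print(select_top_k(X, y, 1))  # feature 0 (the discriminative one) ranks first
```

Because each feature is scored in isolation, the cost is linear in the number of features, which is what makes such filters practical for microarray data; the price, as noted above, is that inter-feature associations are ignored.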
However, since these methods invoke a classifier to measure the quality of every subset, they are very slow and have a very high computational complexity, which restricts their applicability to high-dimensional data [12]. Neither filter nor wrapper methods are guaranteed to find the best solution, and each has its own advantages and defects; however, they can be used as complementary approaches through techniques called hybrid methods. In other words, hybrid methods consist of two levels: in the first level, a filter method reduces the dimensionality of the data; in the second level, a wrapper method selects the best subset of features. Compared to filter methods alone, this process carries a lower risk of eliminating desirable features.

In [13], the authors presented a three-phase hybrid feature selection method specifically designed for high-dimensional data. The method first uses the F-score and IG techniques to eliminate irrelevant and redundant features, reducing the dimensionality; it then combines the results of the two techniques with AND and XOR operators to produce two distinct subsets of features. In the last phase, it uses a wrapper method and a learning algorithm to select the desired features. In [14], the authors introduced a technique called R-m-GA, a hybrid method composed of ReliefF, mRMR and a genetic algorithm.
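The two-level hybrid scheme described above can be sketched as follows: a variance-based filter (standing in for IG or the F-score) first shrinks the feature pool, and a greedy forward wrapper scored by leave-one-out nearest-centroid accuracy (a deliberately simple stand-in for the ABACOH search of the present paper) then picks the final subset. The function names, the choice of classifier, and the toy data are all illustrative assumptions.

```python
# Minimal sketch of a two-level hybrid: filter reduction, then wrapper search.
import numpy as np

def filter_stage(X, k):
    """Level 1 (filter): keep the k features with the highest variance."""
    return np.argsort(X.var(axis=0))[::-1][:k]

def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy of a nearest-centroid classifier on `subset`."""
    correct = 0
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        Xtr, ytr = X[mask][:, subset], y[mask]
        best_c, best_d = None, np.inf
        for c in np.unique(ytr):
            d = np.linalg.norm(X[i, subset] - Xtr[ytr == c].mean(axis=0))
            if d < best_d:
                best_c, best_d = c, d
        correct += best_c == y[i]
    return correct / len(y)

def wrapper_stage(X, y, pool):
    """Level 2 (wrapper): greedy forward selection over the reduced pool."""
    selected, best_acc = [], 0.0
    improved = True
    while improved:
        improved = False
        for f in pool:
            if f in selected:
                continue
            acc = loo_accuracy(X, y, selected + [f])
            if acc > best_acc:
                best_acc, best_f, improved = acc, f, True
        if improved:
            selected.append(best_f)
    return selected, best_acc

# Toy data: feature 0 separates the classes; features 1 and 2 are noise.
X = np.array([[0.0, 0.5, 0.2],
              [0.2, 0.4, 0.1],
              [5.0, 0.6, 0.3],
              [5.2, 0.5, 0.2]])
y = np.array([0, 0, 1, 1])
pool = list(filter_stage(X, 2))          # filter level: 3 features -> 2
subset, acc = wrapper_stage(X, y, pool)  # wrapper level: search the pool
```

The point of the division of labour is visible in the costs: the filter touches every feature once, while the expensive subset evaluations of the wrapper run only over the small pool the filter left behind, which is what makes wrapper-style search feasible on microarray-scale data.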