Int. J. Data Mining and Bioinformatics, Vol. 16, No. 3, 2016
Copyright © 2016 Inderscience Enterprises Ltd.
Pre-processing of microarray gene expression data
for classification using adaptive feature selection
and imputation of non-ignorable missing values
R. Devi Priya*
Department of Information Technology,
Kongu Engineering College,
Erode, Tamil Nadu, India
Email: scrpriya@gmail.com
*Corresponding author
R. Sivaraj
Department of Computer Science and Engineering,
Velalar College of Engineering and Technology,
Erode, Tamil Nadu, India
Email: rsivarajcse@gmail.com
Abstract: Microarray datasets often contain many features and incomplete
values. To address these issues, this paper introduces a method called Genetic
Algorithm-Based Adaptive Feature Selection with Missing value Imputation
(GAFSMI), which makes two contributions. First, Genetic Algorithm-Based
Adaptive Feature Selection (GAFS) is proposed for identifying the noteworthy
features. Second, Bayesian Genetic Algorithm (BAGEL), which integrates a
genetic algorithm with Bayesian principles, is introduced for imputing the
non-ignorable missing values. Together, these two pre-processing steps produce
a complete dataset with an optimal feature subset, on which classification can
be performed with better accuracy. The proposed algorithm is applied to eight
microarray datasets, and it is observed that GAFS selects an optimal feature
subset with higher classification accuracy than other feature selection
techniques. The imputation accuracy of BAGEL is found to be better than that
of other standard imputation techniques at different missing rates (5% to
40%). Classification accuracy improves on all the datasets processed with
GAFS and BAGEL.
Keywords: microarray data set; feature selection; missing values; genetic
algorithm; classification.
Reference to this paper should be made as follows: Devi Priya, R. and Sivaraj,
R. (2016) ‘Pre-processing of microarray gene expression data for classification
using adaptive feature selection and imputation of non-ignorable missing
values’, Int. J. Data Mining and Bioinformatics, Vol. 16, No. 3, pp.183–204.
Biographical notes: R. Devi Priya is an Assistant Professor in the Department
of Information Technology, Kongu Engineering College. She received
her PhD from Anna University, Chennai, in 2013. She has published about
50 papers in national and international conferences and journals. Her research
interests are data warehousing and mining, and nature-inspired algorithms.