International Journal of Electrical and Computer Engineering (IJECE)
Vol. 8, No. 6, December 2018, pp. 4505~4518
ISSN: 2088-8708, DOI: 10.11591/ijece.v8i6.pp4505-4518
Journal homepage: http://iaescore.com/journals/index.php/IJECE
SCDT: FC-NNC-structured Complex Decision Technique for
Gene Analysis Using Fuzzy Cluster based Nearest Neighbor
Classifier
Sudha V.¹, Girijamma H. A.²
¹Department of IS&E, RNS Institute of Technology, India
²Department of Computer Science & Engineering, RNS Institute of Technology, India
Article Info
Article history:
Received Feb 12, 2018
Revised Jun 17, 2018
Accepted Jun 20, 2018
ABSTRACT
Accurate gene analysis is needed for the classification of many diseases.
Selecting the most informative genes is central to this analysis and requires
a decision technique that works in a complex, ambiguous context. Traditional
methods for selecting the most significant genes rely on statistical analyses
such as the 2-sample t-test (2STT), entropy, and signal-to-noise ratio (SNR).
This paper evaluates gene selection and classification: genes are selected
using a structured complex decision technique (SCDT) and classified using a
fuzzy cluster based nearest neighbor classifier (FC-NNC). The effectiveness of
the proposed SCDT and FC-NNC is evaluated using leave-one-out cross-validation
(LOOCV) along with sensitivity, specificity, precision, and F1-score, with
four different classifiers, namely 1) Radial Basis Function (RBF),
2) Multi-layer Perceptron (MLP), 3) Feed Forward (FF), and 4) Support Vector
Machine (SVM), on three datasets: DLBCL, Leukemia, and Prostate tumor. The
proposed SCDT and FC-NNC exhibit superior results and can therefore be
considered a more accurate decision mechanism.
Keywords:
Fuzzy classification
Gene analysis
Gene selection
Machine learning
Micro array data
Copyright © 2018 Institute of Advanced Engineering and Science.
All rights reserved.
Corresponding Author:
Sudha V.,
Department of Information Science & Engineering,
RNS Institute of Technology, Bengaluru, India.
Email: sudhavinayakam@gmail.com
1. INTRODUCTION
Accurate diagnosis is the basis for choosing the right course of treatment, especially for fatal
diseases such as cancer, leukemia, and prostate tumor. Along with histopathology, medical radiology, and
imaging techniques, microarray data analysis can prove helpful and reliable if efficient analysis
techniques are developed [1]. The accuracy of disease classification or early diagnosis depends on how
accurately the significant genes are selected.
DNA microarray data analysis is challenging both statistically and computationally, because the data
combine non-linear noise with high dimensionality and a small number of samples [2]. Many efforts towards
disease classification for diagnosis, particularly of cancers and tumors, have been reported in the
literature [3]-[10]. Section 2 describes the related work. Various machine learning approaches are used
for classification, including the radial basis function (RBF), artificial neural network (ANN), and
support vector machine (SVM), by formulating the problem as binary classification. The problem of
dimension reduction for finding the most significant genes has been formulated in several problem spaces,
including 1) mixed integer programming (MIP), 2) bio-inspired optimization (BIO), 3) mining association
rules (MAR), and 4) ensemble techniques (ET) [8].
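To make the traditional filter-based baseline concrete, the following is a minimal sketch of ranking genes by the signal-to-noise ratio or the two-sample t statistic and scoring a plain nearest-neighbor classifier with leave-one-out cross-validation. It is not the proposed SCDT or FC-NNC. The function names, the Welch form of the t statistic, the 1-NN classifier, and the toy expression matrix are all assumptions made for illustration only.

```python
import math
from statistics import mean, stdev

def snr_score(a, b):
    """Signal-to-noise ratio of one gene: (mean_a - mean_b) / (sd_a + sd_b)."""
    denom = stdev(a) + stdev(b)
    return (mean(a) - mean(b)) / denom if denom > 0 else 0.0

def t_score(a, b):
    """Two-sample t statistic of one gene (Welch form, assumed here)."""
    se = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0

def rank_genes(X, y, score=snr_score):
    """Return gene indices sorted by decreasing |score|.

    X is a list of samples (rows) by genes (columns); y holds 0/1 labels.
    """
    n_genes = len(X[0])
    scores = []
    for g in range(n_genes):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        scores.append(abs(score(a, b)))
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)

def nearest_neighbor(train_X, train_y, x, genes):
    """Plain 1-NN on the selected genes using squared Euclidean distance."""
    def dist(row):
        return sum((row[g] - x[g]) ** 2 for g in genes)
    best = min(range(len(train_X)), key=lambda i: dist(train_X[i]))
    return train_y[best]

def loocv_accuracy(X, y, k=1, score=snr_score):
    """Leave-one-out accuracy; genes are re-ranked inside every fold
    so the held-out sample never influences the selection."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        genes = rank_genes(train_X, train_y, score)[:k]
        if nearest_neighbor(train_X, train_y, X[i], genes) == y[i]:
            correct += 1
    return correct / len(X)

# Hypothetical toy data: 6 samples, 3 genes; gene 0 separates the classes.
X = [
    [1.0, 5.0, 2.0],
    [1.2, 3.0, 2.5],
    [0.9, 4.0, 1.8],
    [3.0, 4.5, 2.2],
    [3.1, 3.5, 2.9],
    [2.9, 5.2, 2.4],
]
y = [0, 0, 0, 1, 1, 1]

print("SNR ranking:", rank_genes(X, y))
print("t-test ranking:", rank_genes(X, y, score=t_score))
print("LOOCV accuracy (top gene):", loocv_accuracy(X, y, k=1))
```

Ranking the genes inside each fold, rather than once on the full data, keeps the held-out sample out of the selection step, which matters when the number of samples is small relative to the number of genes.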
A clinically comprehensive method must handle high-dimensional data with veracity and noise issues
in order to resolve ambiguity when selecting the right gene candidates. This paper proposes a mechanism of