Indonesian Journal of Electrical Engineering and Computer Science
Vol. 18, No. 1, April 2020, pp. 343~350
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v18.i1.pp343-350
Journal homepage: http://ijeecs.iaescore.com
Functional analysis of cancer gene subtype from co-clustering
and classification
Logenthiran Machap¹, Afnizanfaizal Abdullah², Zuraini Ali Shah³
¹,² Synthetic Biology Research Group, Universiti Teknologi Malaysia, Malaysia
³ Artificial Intelligence and Bioinformatics Group, School of Computing, Faculty of Engineering,
Universiti Teknologi Malaysia, Malaysia
Article Info
Article history:
Received Aug 27, 2019
Revised Sep 16, 2019
Accepted Oct 3, 2019
ABSTRACT
Cancer is a heterogeneous genetic disease with large phenotypic differences
among different cancer types and even within the same cancer type. Recent
advances in genome-wide profiling technologies offer a chance to explore
molecular changes throughout the progression of cancer. Accordingly,
various statistical and machine learning algorithms have been designed and
developed for the handling and interpretation of high-throughput microarray
molecular data. Studies on the discovery of molecular subtypes have allowed
cancers to be assigned to groups that are considered to share similar
molecular and clinical characteristics. The main objective of this research
is therefore to discover cancer gene subtypes and to classify genes with higher
accuracy. In particular, an improved co-clustering algorithm is used to discover
cancer subtypes. A supervised infinite feature selection method for gene selection
is then combined with a multi-class SVM for classification of the selected
genes and further biological analysis. The analysis of breast cancer and
glioblastoma multiforme provides evidence that the top genes are involved in cancer
and identifies the pathways present among the top genes of both cancers. This
functional analysis is useful in the medical and pharmaceutical fields for cancer
diagnosis and prognosis.
Keywords:
Biological analysis
Cancer subtypes
Classification
Co-clustering
Microarray
Copyright © 2020 Institute of Advanced Engineering and Science.
All rights reserved.
Corresponding Author:
Logenthiran Machap,
School of Computing, Faculty of Engineering,
Universiti Teknologi Malaysia, 81310 Skudai Johor Malaysia.
Email: logmac_87@yahoo.com
1. INTRODUCTION
Abnormalities of the cancer genome, observed through basic research, have been used to
categorize patients in order to improve clinical decision making and to implement more efficient
treatments. Although this type of categorization has improved the efficiency of treatment for various
cancers, heterogeneity within patient populations remains a main challenge. The advancement of
DNA microarray technology has permitted an extensive understanding of genes, especially in oncology,
for the onset, diagnosis and prognosis of cancers. These diagnostics are useful for different types of cancer
and lead to individual treatment plans and more accurate estimation of clinical outcomes [1, 2].
The initial stage in organizing and investigating high-throughput gene expression datasets is
typically to apply artificial intelligence and machine learning approaches that group (cluster) the data
according to similar biological features (genes) or conditions (samples), based on some similarity measure [3-5].
Because prior knowledge about both features and conditions is typically limited, clustering
is conducted as an unsupervised process that groups features and conditions [6]. Conventional
clustering is not considered an ideal method for complicated and heterogeneous cancers. This is because,
among similar clinical types of cancer in a specific tissue, only certain genes are expressed as cancer
genes in cellular processes, and only in a subset of samples. A further limitation is that a single gene
might play a role in regulating and participating in numerous clusters and pathways under different conditions [7].
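
The limitation of conventional clustering described above can be illustrated with a minimal sketch. It is not the improved co-clustering algorithm of this paper. It only assumes Pearson correlation as the similarity measure, and the gene names, expression values and sample subset are hypothetical toy data. Two genes that look unrelated when compared across all samples may be strongly co-expressed within a subset of samples, which is the kind of local pattern that grouping genes and samples jointly is meant to capture.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def similarity(expr, gene_1, gene_2, samples):
    """Correlation of two genes restricted to the given sample indices."""
    profile_1 = [expr[gene_1][i] for i in samples]
    profile_2 = [expr[gene_2][i] for i in samples]
    return pearson(profile_1, profile_2)


# Hypothetical toy data: rows are genes, columns are six samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 5.0, 1.0, 4.0],
    "geneB": [1.1, 2.1, 2.9, 1.0, 5.0, 2.0],
}

all_samples = range(6)
sample_subset = [0, 1, 2]  # assumed subset, e.g. one candidate subtype

# One-way clustering compares genes over every sample at once.
full_similarity = similarity(expr, "geneA", "geneB", all_samples)
# Restricting the comparison to a sample subset exposes local co-expression.
subset_similarity = similarity(expr, "geneA", "geneB", sample_subset)

print(f"all samples: {full_similarity:.2f}")
print(f"subset:      {subset_similarity:.2f}")
```

In this toy case the two genes are negatively correlated over all six samples but almost perfectly correlated over the first three, so a method that compares genes only over the full sample set would not group them together.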