International Journal of Intelligence Science, 2017, 7, 9-23
http://www.scirp.org/journal/ijis
ISSN Online: 2163-0356
ISSN Print: 2163-0283
DOI: 10.4236/ijis.2017.71002 January 13, 2017
Identifying Cancer Disease through
Deoxyribonucleic Acid (DNA) Sequential
Pattern Mining
Lailil Muflikhah, Ilham Yuliantoro
Computer Science Department, Brawijaya University, Malang, Indonesia
Abstract
This paper proposes a sequential pattern discovery method for a
Deoxyribonucleic Acid (DNA) sequence database in order to identify cancer
disease. In cancer, the DNA of gene P53, which encodes the amino acids of the
P53 protein, is mutated, and this changes the formation of P53. Sequential
pattern discovery is the process of extracting knowledge from data about series
of events that occur in sequence with a certain frequency, so that they form a
pattern. PrefixSpan is the proposed method for finding patterns in the DNA
sequence database. As a result, various patterns of DNA sequence are selected.
The patterns with high similarity are used as biomarkers to identify breast
cancer. The first performance measure is support: the average support value is
0.8, which means that the sequence patterns occur frequently. Another measure
is confidence: all of the confidence values are 1. The last performance measure
is the lift ratio, whose average is greater than 1, which means that the
sequence items composing a pattern have high dependency and relatedness.
Furthermore, when the selected patterns are applied as biomarkers, the accuracy
is 100%.
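To make the three measures concrete, the following is a minimal Python sketch under stated assumptions, not the implementation used in this paper. It assumes each sequence is a string of single-letter items, that a pattern matches when its items occur in order with gaps allowed, and that confidence and lift follow the usual association-rule definitions applied to a rule "x followed by y". The four toy sequences and all function names are hypothetical and are not taken from the P53 data.

```python
from collections import Counter
from typing import Dict, List


def is_subsequence(pattern: str, sequence: str) -> bool:
    """True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern: str, db: List[str]) -> float:
    """Fraction of sequences in `db` that contain `pattern`."""
    return sum(is_subsequence(pattern, s) for s in db) / len(db)


def confidence(x: str, y: str, db: List[str]) -> float:
    """Assumed definition: support(x followed by y) / support(x); needs support(x) > 0."""
    return support(x + y, db) / support(x, db)


def lift(x: str, y: str, db: List[str]) -> float:
    """Assumed definition: confidence(x -> y) / support(y); needs support(y) > 0."""
    return confidence(x, y, db) / support(y, db)


def prefixspan(db: List[str], min_support: float) -> Dict[str, float]:
    """Simplified PrefixSpan for sequences of single items.

    Grows each frequent prefix by one item at a time and recurses on the
    projected database (the suffixes after the first match of that item).
    Returns a mapping from pattern to its support.
    """
    n = len(db)
    patterns: Dict[str, float] = {}

    def mine(prefix: str, projected: List[str]) -> None:
        # Count each item at most once per projected suffix.
        counts: Counter = Counter()
        for suffix in projected:
            counts.update(set(suffix))
        for item, count in sorted(counts.items()):
            if count / n < min_support:
                continue
            patterns[prefix + item] = count / n
            # Project: keep what follows the first occurrence of `item`.
            mine(prefix + item,
                 [s[s.index(item) + 1:] for s in projected if item in s])

    mine("", list(db))
    return patterns


# Hypothetical toy database of nucleotide strings (illustration only).
toy_db = ["ATGC", "ATCG", "AGTC", "TTGA"]
frequent = prefixspan(toy_db, min_support=0.75)

print("support(ATC)       =", frequent["ATC"])
print("confidence(AT -> C) =", confidence("AT", "C", toy_db))
print("lift(AT -> C)       =", lift("AT", "C", toy_db))
```

In this toy database the rule "AT followed by C" has confidence 1 and a lift above 1, which is the kind of result the abstract interprets as high dependency between the items of a pattern.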
Keywords
Sequential Pattern, Breast Cancer, DNA, PrefixSpan, Lift Ratio
1. Introduction
Cancer is classified as a malignant and deadly disease. It is associated with
uncontrolled growth of cells and with gene mutations, such as mutations of the
P53 gene. This disease changes the P53 protein sequence [1]. This protein
consists of a combination of the 20 amino acids, which ribosomes assemble
according to the genetic code of the Deoxyribonucleic Acid (DNA). If the DNA is
mutated, then the protein composition will be incorrect. Continuously, it is effective
How to cite this paper: Muflikhah, L. and Yuliantoro, I. (2017) Identifying
Cancer Disease through Deoxyribonucleic Acid (DNA) Sequential Pattern Mining.
International Journal of Intelligence Science, 7, 9-23.
http://dx.doi.org/10.4236/ijis.2017.71002
Received: October 1, 2016
Accepted: January 10, 2017
Published: January 13, 2017
Copyright © 2017 by authors and
Scientific Research Publishing Inc.
This work is licensed under the Creative
Commons Attribution International
License (CC BY 4.0).
http://creativecommons.org/licenses/by/4.0/
Open Access