International Journal of Intelligence Science, 2017, 7, 9-23
http://www.scirp.org/journal/ijis
ISSN Online: 2163-0356
ISSN Print: 2163-0283
DOI: 10.4236/ijis.2017.71002
January 13, 2017

Identifying Cancer Disease through Deoxyribonucleic Acid (DNA) Sequential Pattern Mining

Lailil Muflikhah, Ilham Yuliantoro
Computer Science Department, Brawijaya University, Malang, Indonesia

Abstract

This paper proposes a sequential pattern discovery method for a deoxyribonucleic acid (DNA) sequence database in order to identify cancer. In cancer, the DNA of the P53 gene, which encodes the amino acid sequence of the P53 protein, is mutated, and this mutation alters the formation of P53. Sequential pattern discovery is the process of extracting knowledge from data about series of events that occur in sequence with a certain frequency, so that they form a pattern. PrefixSpan is the method proposed for finding patterns in the DNA sequence database. As a result, various patterns of DNA sequences are selected. The patterns with high similarity are used as biomarkers to identify breast cancer. Three performance measures are reported. The average support value is 0.8, which means that the patterns occur frequently in the sequence database. All of the confidence values are 1. The average lift ratio is greater than 1, which means that the sequence items composing a pattern are strongly dependent on and related to one another. Furthermore, the selected patterns, applied as biomarkers, achieve an accuracy of 100%.

Keywords

Sequential Pattern, Breast Cancer, DNA, PrefixSpan, Lift Ratio

1. Introduction

Cancer is classified as a malignant and deadly disease. It involves uncontrolled cell growth and gene mutations, namely in the P53 gene. The disease changes the P53 protein sequence [1]. This protein is built from combinations of the 20 amino acids, assembled by ribosomes according to the genetic code of the deoxyribonucleic acid (DNA).
How to cite this paper: Muflikhah, L. and Yuliantoro, I. (2017) Identifying Cancer Disease through Deoxyribonucleic Acid (DNA) Sequential Pattern Mining. International Journal of Intelligence Science, 7, 9-23. http://dx.doi.org/10.4236/ijis.2017.71002

Received: October 1, 2016
Accepted: January 10, 2017
Published: January 13, 2017

Copyright © 2017 by authors and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY 4.0). http://creativecommons.org/licenses/by/4.0/
Open Access

If the DNA is mutated, then the protein composition will be incorrect. Continuously, it is effective