Identifying Cancer Disease through Deoxyribonucleic Acid (DNA) Sequential Pattern Mining

PP. 9-23
DOI: 10.4236/ijis.2017.71002

ABSTRACT

This paper proposes a sequential pattern discovery method for a Deoxyribonucleic Acid (DNA) sequence database in order to identify cancer. The DNA of gene P53, which encodes the protein's amino acids, is mutated, and this changes the formation of P53. Sequential pattern discovery is the process of extracting knowledge from data about series of events: sequences that occur with a certain frequency form a pattern. PrefixSpan is the method proposed to find patterns in the DNA sequence database. It yields a variety of selected DNA sequence patterns, and the patterns with high similarity are used as biomarkers to identify breast cancer. Three performance measures are reported. The average support value is 0.8, which means that the sequence patterns occur frequently. All of the confidence values are 1. The average lift ratio is greater than 1, which means that the sequence items composing a pattern are highly dependent and related. Furthermore, the selected patterns, applied as biomarkers, achieve an accuracy of 100%.
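To make the measures named in the abstract concrete, the following is a minimal sketch of PrefixSpan-style mining together with support, confidence, and lift. It is an illustration under stated assumptions, not the authors' implementation: the function names `prefixspan` and `rule_measures` are hypothetical, the four toy sequences are invented and do not come from the paper's P53 data, and each character is treated as one item (for example a nucleotide or a one-letter amino acid code; the abstract does not say which encoding is used). Lift is assumed here to be confidence divided by the support of the consequent.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern: support_count} for every frequent subsequence.

    Assumptions of this sketch: each sequence is a string, each character
    is a single item, gaps between matched items are allowed, and
    min_support is an absolute number of sequences.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds one (sequence index, suffix start) pair per
        # sequence that contains the current prefix.
        occurrences = defaultdict(list)
        for seq_id, start in projected:
            seen = set()
            for pos in range(start, len(sequences[seq_id])):
                item = sequences[seq_id][pos]
                if item not in seen:  # first occurrence in this suffix only
                    seen.add(item)
                    occurrences[item].append((seq_id, pos + 1))
        for item, next_projected in occurrences.items():
            if len(next_projected) >= min_support:
                pattern = prefix + item
                results[pattern] = len(next_projected)
                mine(pattern, next_projected)  # grow the prefix recursively

    mine("", [(i, 0) for i in range(len(sequences))])
    return results


def rule_measures(patterns, n_sequences, antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    sup_rule = patterns[antecedent + consequent] / n_sequences
    sup_ante = patterns[antecedent] / n_sequences
    sup_cons = patterns[consequent] / n_sequences
    confidence = sup_rule / sup_ante
    lift = confidence / sup_cons  # assumed definition of the lift ratio
    return sup_rule, confidence, lift


# Invented toy data, for illustration only.
toy_sequences = ["ACGT", "ACT", "AGT", "CGA"]
patterns = prefixspan(toy_sequences, min_support=3)
print(patterns)
print(rule_measures(patterns, len(toy_sequences), "A", "T"))
```

On this toy data the only frequent pattern longer than one item is `AT`, and the rule `A -> T` has a lift of exactly 1, meaning `T` is no more likely after `A` than overall. A lift above 1, as reported in the abstract, would indicate that the items in a pattern co-occur more often than independence would predict.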

Share and Cite:

Muflikhah, L. and Yuliantoro, I. (2017) Identifying Cancer Disease through Deoxyribonucleic Acid (DNA) Sequential Pattern Mining. International Journal of Intelligence Science, 7, 9-23. doi: 10.4236/ijis.2017.71002.

Copyright © 2024 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.