Sequence Motif-Based One-Class Classifiers Can Achieve Comparable Accuracy to Two-Class Learners for Plant microRNA Detection

HTML  XML Download Download as PDF (Size: 644KB)  PP. 684-694  
DOI: 10.4236/jbise.2015.810065    5,183 Downloads   6,159 Views  Citations
Author(s)

ABSTRACT

microRNAs (miRNAs) are short nucleotide sequences expressed by a genome that are involved in post transcriptional modulation of gene expression. Since miRNAs need to be co-expressed with their target mRNA to observe an effect and since miRNAs and target interactions can be cooperative, it is currently not possible to develop a comprehensive experimental atlas of miRNAs and their targets. To overcome this limitation, machine learning has been applied to miRNA detection. In general binary learning (two-class) approaches are applied to miRNA discovery. These learners consider both positive (miRNA) and negative (non-miRNA) examples during the training process. One-class classifiers, on the other hand, use only the information for the target class (miRNA). The one-class approach in machine learning is gradually receiving more attention particularly for solving problems where the negative class is not well defined. This is especially true for miRNAs where the positive class can be experimentally confirmed relatively easy, but where it is not currently possible to call any part of a genome a non-miRNA. To do that, it should be co-expressed with all other possible transcripts of the genome, which currently is a futile endeavor. For machine learning, miRNAs need to be transformed into a feature vector and some currently used features like minimum free energy vary widely in the case of plant miRNAs. In this study it was our aim to analyze different methods applying one-class approaches and the effectiveness of motif-based features for prediction of plant miRNA genes. We show that the application of these one-class classifiers is promising and useful for this kind of problem which relies only on sequence- based features such as k-mers and motifs comparing to the results from two-class classification. In some cases the results of one-class are, to our surprise, more accurate than results from two-class classifiers.

Share and Cite:

Yousef, M. , Allmer, J. and Khalifa, W. (2015) Sequence Motif-Based One-Class Classifiers Can Achieve Comparable Accuracy to Two-Class Learners for Plant microRNA Detection. Journal of Biomedical Science and Engineering, 8, 684-694. doi: 10.4236/jbise.2015.810065.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

Creative Commons License

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.