TITLE:
Data Analysis of Multiplex Sequencing at SOLiD Platform: A Probabilistic Approach to Characterization and Reliability Increase
AUTHORS:
Fábio Manoel França Lobato, Carlos Diego Damasceno, Daniela Soares Leite, Ândrea Kelly Ribeiro-dos-Santos, Sylvain Darnet, Carlos Renato Francês, Nandamudi Lankalapalli Vijaykumar, Ádamo Lima de Santana
KEYWORDS:
Probabilistic Modeling, Health Informatics, SOLiD Barcoding System, Statistical Analysis, Multiplex Sequencing
JOURNAL NAME:
American Journal of Molecular Biology, Vol.8 No.1, December 22, 2017
ABSTRACT: New sequencing technologies such as Illumina/Solexa, SOLiD/ABI, and 454/Roche have revolutionized biological research. In this context, the SOLiD platform offers a particular sequencing mode, known as a multiplex run, which enables the sequencing of several samples in a single run. This reduces cost and simplifies the analysis of related samples. However, this sequencing mode requires an additional filtering step to ensure the reliability of the results. We therefore propose a probabilistic model that considers the intrinsic characteristics of each sequencing run to characterize multiplex runs and filter low-quality data, increasing the reliability of data analysis for multiplex sequencing performed on SOLiD. The results show that the proposed model is satisfactory for: 1) identifying faults in the sequencing process; 2) supporting the adaptation and development of new protocols for sample preparation; 3) assigning a degree of confidence to the data generated; and 4) guiding a filtering process without arbitrarily discarding useful sequences.