TITLE:
Challenges Analyzing RNA-Seq Gene Expression Data
AUTHORS:
Liliana López-Kleine, Cristian González-Prieto
KEYWORDS:
RNA-Seq Analysis, Count Data, Preprocessing, Differential Expression, Gene Co-Expression Network
JOURNAL NAME:
Open Journal of Statistics, Vol.6 No.4, August 19, 2016
ABSTRACT: The analysis of messenger ribonucleic acid data obtained through
sequencing techniques (RNA-sequencing) is very challenging. Once technical
difficulties have been sorted out, an important choice has to be made during
pre-processing between two paths: transform the RNA-sequencing count data to a
continuous variable, or continue to work with count data. For each data type,
analysis tools have been developed
and seem appropriate at first sight, but a deeper analysis of data distribution
and structure is worth discussing. In this review, open questions regarding
the nature of RNA-sequencing data are discussed and highlighted, indicating
important future research topics in statistics that should be addressed for a
better analysis of existing and newly generated gene expression data. Moreover,
a comparative analysis of RNA-seq count data and transformed data is presented.
This comparison indicates that transforming RNA-seq count data seems
appropriate, at least for differential expression detection.