Identification of Global Gene Expression Shifts Using Microarray Data from Different Biological Conditions


Over the past two decades, gene expression data have been widely used to detect differentially expressed genes when two (or more) biological conditions are compared. Studies seeking differentially expressed genes test, gene by gene, for a difference in mean expression between two conditions. However, the global shift in gene expression across all genes present in a microarray experiment has not yet been investigated, and it could provide different information on the genes affected by the condition under study. Such a global approach would help identify a gene expression threshold characteristic of a given condition, which could therefore be used for diagnosis together with the list of differentially expressed genes detected by classical methods. Moreover, characterizing the genes below or above such a threshold could give new insights into the molecular mechanisms functionally implicated in each condition. Here, we present a simple methodology based on heuristics, gene filtering, variable transformation and descriptive statistics to identify such global gene expression shifts and the characteristic threshold, so that it can be applied by any professional who works with gene expression data and not only by statisticians. Our procedure is illustrated on a real gene expression data set comparing pathogen-inoculated tomatoes with non-inoculated tomatoes. The methodology can be used to identify threshold values for continuous-variable data sets from two populations whose distributions (histograms) overlap in most of their percentiles.
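To make the idea of a percentile-based comparison concrete, the following is a minimal sketch and not the authors' procedure. The abstract names gene filtering, variable transformation and descriptive statistics but does not give their details, so every specific here is an assumption: the filtering rule (`min_expr`), the log2 transform, the use of the 1st to 99th percentiles, and the hypothetical function name `percentile_shift`. It reports the percentile at which two conditions differ most as one possible candidate threshold.

```python
import numpy as np


def percentile_shift(control, treated, min_expr=1.0):
    """Locate the percentile where two expression distributions differ most.

    Illustrative only: the filter, the transform and the summary are
    assumptions, not the published method.
    """
    control = np.asarray(control, dtype=float)
    treated = np.asarray(treated, dtype=float)

    # Gene filtering (assumed rule): keep genes expressed above
    # min_expr in both conditions.
    keep = (control > min_expr) & (treated > min_expr)

    # Variable transformation (assumed): log2 of the raw intensities.
    c = np.log2(control[keep])
    t = np.log2(treated[keep])

    # Descriptive statistics: compare the two samples percentile by percentile.
    qs = np.arange(1, 100)
    pc = np.percentile(c, qs)
    pt = np.percentile(t, qs)
    diff = pt - pc

    # Candidate threshold: the percentile with the largest absolute shift.
    k = int(np.argmax(np.abs(diff)))
    return {
        "percentile": int(qs[k]),
        "control_value": float(pc[k]),
        "treated_value": float(pt[k]),
        "shift": float(diff[k]),
        "n_genes": int(keep.sum()),
    }


# Synthetic example: the two conditions coincide in the lower half of the
# distribution and differ by a factor of 4 in the upper half.
control = np.linspace(2.0, 1024.0, 1000)
treated = control.copy()
treated[control > 512.0] *= 4.0
result = percentile_shift(control, treated)
```

In this synthetic case the two samples agree below the median and diverge above it, so the reported percentile falls in the upper half of the distribution. Real microarray data would need normalization and a filtering rule chosen for the platform before such a comparison is meaningful.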

Share and Cite:

Cuevas, J. , Kleine, L. and Ordoñez, J. (2015) Identification of Global Gene Expression Shifts Using Microarray Data from Different Biological Conditions. Open Journal of Statistics, 5, 360-372. doi: 10.4236/ojs.2015.55038.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2023 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.