Estimating the Empirical Null Distribution of Maxmean Statistics in Gene Set Analysis

PP. 761-767
DOI: 10.4236/ojs.2017.75053

ABSTRACT

Gene Set Analysis (GSA) is a framework for testing the association between a set of genes and an outcome, e.g. disease status or treatment group. The method relies on computing a maxmean statistic and estimating the null distribution of the maxmean statistics via a restandardization procedure. In practice, genes within pre-determined gene sets are more strongly correlated with each other than with genes in other sets. This may bias the estimated null distribution. We derive an asymptotic null distribution of the maxmean statistics based on a sparsity assumption. We propose a flexible two-group mixture model for the maxmean statistics. The mixture model allows us to estimate the null parameters empirically via a maximum likelihood approach. Our empirical method is compared with the restandardization procedure of GSA in simulations. We show that our method is more accurate in null density estimation when the genes are strongly correlated within gene sets.
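
For readers unfamiliar with the statistic named above, the following is a minimal sketch of a maxmean computation for a single gene set. It assumes the usual unsigned definition (the larger of the average positive part and the average negative part of the gene-level scores). The function name `maxmean` and the example scores are illustrative only and are not taken from the paper.

```python
def maxmean(scores):
    """Unsigned maxmean statistic for one gene set (illustrative sketch).

    `scores` holds one gene-level test statistic per gene in the set.
    Assumed definition: max(mean of positive parts, mean of negative parts),
    where both means are taken over all genes in the set.
    """
    values = [float(v) for v in scores]
    if not values:
        raise ValueError("gene set must contain at least one score")
    n = len(values)
    # Average positive part: negative scores contribute 0.
    pos_mean = sum(max(v, 0.0) for v in values) / n
    # Average negative part: positive scores contribute 0.
    neg_mean = sum(max(-v, 0.0) for v in values) / n
    return max(pos_mean, neg_mean)


# Hypothetical gene-level scores for a set of four genes.
example_scores = [2.0, -1.0, 0.5, 3.0]
print(maxmean(example_scores))
```

Because both averages divide by the full set size, a few strongly positive (or strongly negative) genes can drive the statistic even when the remaining genes in the set are near zero.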

Share and Cite:

Ren, X. , Wang, J. , Liu, S. and Miecznikowski, J. (2017) Estimating the Empirical Null Distribution of Maxmean Statistics in Gene Set Analysis. Open Journal of Statistics, 7, 761-767. doi: 10.4236/ojs.2017.75053.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.