Estimating the Empirical Null Distribution of Maxmean Statistics in Gene Set Analysis
ABSTRACT
Gene Set Analysis (GSA) is a framework for testing the association between a
set of genes and an outcome, e.g. disease status or treatment group. The
method relies on computing a maxmean statistic and estimating the null
distribution of the maxmean statistics via a restandardization procedure. In
practice, genes within a pre-determined gene set are more strongly correlated
with one another than with genes in other sets. This may bias the estimated
null distribution. We derive an asymptotic null distribution of the maxmean
statistics based on a sparsity assumption. We propose a flexible two-group
mixture model for the maxmean statistics. The mixture model allows us to
estimate the null parameters empirically via a maximum likelihood approach. Our
empirical method is compared with the restandardization procedure of GSA in
simulations. We show that our method is more accurate in null density
estimation when the genes are strongly correlated within gene sets.
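
For readers unfamiliar with the maxmean statistic, the following is a minimal sketch of how it is commonly computed for a single gene set. The abstract does not spell out the definition, so this assumes the usual formulation: average the positive parts and the negative parts of the gene-level scores separately, then keep whichever is larger in magnitude. The function name `maxmean`, the signed return convention, and the example scores are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def maxmean(z):
    """Maxmean statistic for the gene-level scores z of one gene set.

    Assumed convention (not stated in the abstract): the result carries
    the sign of the dominant direction.
    """
    z = np.asarray(z, dtype=float)
    pos = np.maximum(z, 0.0).mean()   # mean of positive parts over all genes
    neg = np.maximum(-z, 0.0).mean()  # mean of negative parts, as a magnitude
    # Keep the larger of the two averages, signed by its direction.
    return pos if pos >= neg else -neg


# Hypothetical scores for a four-gene set.
scores = [2.0, -1.0, 0.5, -0.5]
print(maxmean(scores))
```

Both averages divide by the full set size, so a few strongly up-regulated genes are not diluted by cancellation against down-regulated genes in the same set.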
Share and Cite:
Ren, X., Wang, J., Liu, S. and Miecznikowski, J. (2017) Estimating the Empirical Null Distribution of Maxmean Statistics in Gene Set Analysis. Open Journal of Statistics, 7, 761-767. doi: 10.4236/ojs.2017.75053.