Combining p‐values in large‐scale genomics experiments

1 July 2007

journal article
research article
Published by Wiley in Pharmaceutical Statistics

Vol. 6 (3) , 217-226
https://doi.org/10.1002/pst.304

Abstract

In large‐scale genomics experiments involving thousands of statistical tests, such as association scans and microarray expression experiments, a key question is: Which of the L tests represent true associations (TAs)? The traditional way to control false findings is via individual adjustments. In the presence of multiple TAs, p‐value combination methods offer certain advantages. Both Fisher's and Lancaster's combination methods use an inverse gamma transformation. We identify the relation of the shape parameter of that distribution to the implicit threshold value; p‐values below that threshold are favored by the inverse gamma method (GM). We explore this feature to improve power over Fisher's method when L is large and the number of TAs is moderate. However, the improvement in power provided by combination methods is at the expense of a weaker claim made upon rejection of the null hypothesis: that there are some TAs among the L tests. Thus, GM remains a global test. To allow a stronger claim about a subset of p‐values that is smaller than L, we investigate two methods with an explicit truncation: the rank truncated product method (RTP) that combines the first K ordered p‐values, and the truncated product method (TPM) that combines p‐values that are smaller than a specified threshold. We conclude that TPM allows claims to be made about subsets of p‐values, while the claim of the RTP is, like GM, more appropriately about all L tests. GM gives somewhat higher power than TPM, RTP, Fisher, and Simes methods across a range of simulations. Copyright © 2007 John Wiley & Sons, Ltd.
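
As a rough illustration of the combination statistics named in the abstract, the sketch below computes Fisher's combined p‐value and the RTP and TPM statistics in plain Python. It is not the authors' implementation. The function names, the use of `p <= tau` as the TPM inclusion rule, and the Monte Carlo null (which assumes independent, uniformly distributed p‐values and stands in for whatever null distributions the article uses) are all assumptions made for this example. The inverse gamma method (GM) is not shown.

```python
import math
import random


def fisher_combined_pvalue(pvalues):
    """Fisher's method: -2 * sum(log p) is chi-square with 2L df under the global null.

    Uses the closed-form chi-square survival function for even df,
    P(X > x) = exp(-x/2) * sum_{k=0}^{L-1} (x/2)^k / k!,
    evaluated in log space so that large L does not overflow.
    """
    L = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # Fisher statistic divided by 2
    if half_stat == 0.0:  # every p-value equals 1
        return 1.0
    log_terms = [k * math.log(half_stat) - math.lgamma(k + 1) for k in range(L)]
    m = max(log_terms)
    log_sum = m + math.log(sum(math.exp(t - m) for t in log_terms))
    return min(1.0, math.exp(log_sum - half_stat))


def rtp_log_stat(pvalues, k):
    """Rank truncated product: log of the product of the k smallest p-values."""
    return sum(math.log(p) for p in sorted(pvalues)[:k])


def tpm_log_stat(pvalues, tau):
    """Truncated product: log of the product of p-values with p <= tau (assumed rule)."""
    return sum(math.log(p) for p in pvalues if p <= tau)


def monte_carlo_pvalue(log_stat_fn, pvalues, n_sim=2000, seed=0):
    """Monte Carlo p-value for a product statistic (smaller product = more extreme).

    Assumption: the L tests are independent with uniform p-values under the null.
    """
    rng = random.Random(seed)
    observed = log_stat_fn(pvalues)
    L = len(pvalues)
    hits = 0
    for _ in range(n_sim):
        null = [1.0 - rng.random() for _ in range(L)]  # draws in (0, 1], avoids log(0)
        if log_stat_fn(null) <= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)
```

For example, `monte_carlo_pvalue(lambda p: tpm_log_stat(p, 0.05), pvalues)` gives a simulated TPM p‐value with a hypothetical threshold of 0.05, and `monte_carlo_pvalue(lambda p: rtp_log_stat(p, 10), pvalues)` does the same for RTP with K = 10. Correlated tests, which are common in association scans, would need a null simulation that preserves the dependence structure.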

Funding Information

National Institutes of Health, National Institute of Environmental Health Sciences

This publication has 16 references indexed in Scilit:

Ranks of Genuine Associations in Whole-Genome Scans
Genetics, 2005
Genome-wide association studies for common diseases and complex traits
Nature Reviews Genetics, 2005
An efficient Monte Carlo approach to assessing statistical significance in genomic studies
Bioinformatics, 2004
Efficient Computation of Significance Levels for Multiple Associations in Large Studies of Correlated Data, Including Genomewide Association Studies
American Journal of Human Genetics, 2004
Glucocorticoid-related genetic susceptibility for Alzheimer's disease
Human Molecular Genetics, 2003
Trimming, Weighting, and Grouping SNPs in Human Case-Control Association Studies
Genome Research, 2001
The Simes Method for Multiple Hypothesis Testing with Positively Dependent Test Statistics
Journal of the American Statistical Association, 1997
A Consensus Combined P-Value Test and the Family-Wide Significance of Component Tests
Biometrics, 1990
An improved Bonferroni procedure for multiple tests of significance
Biometrika, 1986
The Combination of Probabilities: An Application of Orthonormal Functions
Australian Journal of Statistics, 1961