Combining p‐values in large‐scale genomics experiments
- 1 July 2007
- journal article
- research article
- Published by Wiley in Pharmaceutical Statistics
- Vol. 6 (3) , 217-226
- https://doi.org/10.1002/pst.304
Abstract
In large‐scale genomics experiments involving thousands of statistical tests, such as association scans and microarray expression experiments, a key question is: Which of the L tests represent true associations (TAs)? The traditional way to control false findings is via individual adjustments. In the presence of multiple TAs, p‐value combination methods offer certain advantages. Both Fisher's and Lancaster's combination methods use an inverse gamma transformation. We identify the relation of the shape parameter of that distribution to the implicit threshold value; p‐values below that threshold are favored by the inverse gamma method (GM). We explore this feature to improve power over Fisher's method when L is large and the number of TAs is moderate. However, the improvement in power provided by combination methods is at the expense of a weaker claim made upon rejection of the null hypothesis: that there are some TAs among the L tests. Thus, GM remains a global test. To allow a stronger claim about a subset of p‐values that is smaller than L, we investigate two methods with an explicit truncation: the rank truncated product method (RTP) that combines the first K ordered p‐values, and the truncated product method (TPM) that combines p‐values that are smaller than a specified threshold. We conclude that TPM allows claims to be made about subsets of p‐values, while the claim of the RTP is, like GM, more appropriately about all L tests. GM gives somewhat higher power than TPM, RTP, Fisher, and Simes methods across a range of simulations. Copyright © 2007 John Wiley & Sons, Ltd.
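The combination methods named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`gamma_method`, `tpm`, `rtp`), the default values, and the use of a Monte Carlo null distribution for the truncated methods are all assumptions made here for illustration, and every function assumes the L p‐values are independent and uniform under the null. The one relation the sketch relies on is standard: with shape 1, the inverse gamma transformation reduces to Fisher's method.

```python
import numpy as np
from scipy import stats


def gamma_method(pvals, shape=1.0):
    """Combine independent p-values via an inverse gamma transformation.

    Hypothetical helper: shape=1.0 reproduces Fisher's method, because
    the Gamma(1, scale=1) upper-tail quantile of p is -ln(p).
    """
    p = np.asarray(pvals, dtype=float)
    # Map each p-value to the upper-tail quantile of Gamma(shape, scale=1).
    t = stats.gamma.isf(p, shape).sum()
    # Under the global null, the sum follows Gamma(L * shape, scale=1).
    return float(stats.gamma.sf(t, shape * p.size))


def _mc_pvalue(stat, pvals, n_sim, seed):
    """Monte Carlo p-value for a log-product statistic (assumed approach)."""
    p = np.asarray(pvals, dtype=float)
    rng = np.random.default_rng(seed)
    # Simulate the statistic for n_sim sets of L independent uniform p-values.
    null = stat(rng.uniform(size=(n_sim, p.size)))
    # Smaller log-products are more extreme; add-one avoids a p-value of 0.
    return float((1 + np.sum(null <= stat(p))) / (n_sim + 1))


def tpm(pvals, tau=0.05, n_sim=10_000, seed=0):
    """Truncated product: combine only the p-values that are <= tau."""
    def log_w(x):
        # p-values above tau contribute a factor of 1 (log 1 = 0).
        return np.where(x <= tau, np.log(x), 0.0).sum(axis=-1)
    return _mc_pvalue(log_w, pvals, n_sim, seed)


def rtp(pvals, k, n_sim=10_000, seed=0):
    """Rank truncated product: combine the k smallest p-values."""
    def log_w(x):
        return np.log(np.sort(x, axis=-1)[..., :k]).sum(axis=-1)
    return _mc_pvalue(log_w, pvals, n_sim, seed)
```

For example, `gamma_method(p, shape=1.0)` matches the Fisher combined p‐value for the same input, while `tpm(p, tau=0.05)` and `rtp(p, k=3)` return simulation-based estimates whose precision depends on `n_sim`. The sketch does not encode the relation between the shape parameter and the implicit threshold that the paper identifies.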
Funding Information
- National Institutes of Health, National Institute of Environmental Health Sciences