Variance of the Number of False Discoveries
- 24 May 2005
- journal article
- Published by Oxford University Press (OUP) in Journal of the Royal Statistical Society Series B: Statistical Methodology
- Vol. 67 (3) , 411-426
- https://doi.org/10.1111/j.1467-9868.2005.00509.x
Abstract
Summary: In high throughput genomic work, a very large number d of hypotheses are tested based on n≪d data samples. The large number of tests necessitates an adjustment for false discoveries in which a true null hypothesis was rejected. The expected number of false discoveries is easy to obtain. Dependences between the hypothesis tests greatly affect the variance of the number of false discoveries. Assuming that the tests are independent gives an inadequate variance formula. The paper presents a variance formula that takes account of the correlations between test statistics. That formula involves O(d²) correlations, and so a naïve implementation has cost O(nd²). A method based on sampling pairs of tests allows the variance to be approximated at a cost that is independent of d.
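The pair-sampling idea in the summary can be illustrated with a minimal sketch. It is not the paper's estimator. It assumes, for illustration only, that all d hypotheses are true nulls, that each is rejected with probability alpha, and that a hypothetical function pair_cov(i, j) returns the covariance between the rejection indicators of tests i and j. Under those assumptions the variance of the number of false discoveries V is d·alpha·(1 − alpha) plus the sum of pair_cov over all ordered pairs i ≠ j. The exact sum has d(d − 1) terms; the sampled version replaces it by an average over a fixed number of randomly drawn pairs, scaled by d(d − 1). The block-dependence model (block_cov) and all function names are assumptions made for this example.

```python
import random


def exact_variance(d, alpha, pair_cov):
    """Var(V) from all d(d-1) ordered pairs: quadratic cost in d."""
    diagonal = d * alpha * (1 - alpha)
    off_diagonal = sum(
        pair_cov(i, j) for i in range(d) for j in range(d) if i != j
    )
    return diagonal + off_diagonal


def sampled_variance(d, alpha, pair_cov, n_pairs, rng):
    """Approximate Var(V) from n_pairs uniformly sampled ordered pairs.

    The number of pair_cov evaluations is n_pairs, whatever d is.
    """
    total = 0.0
    for _ in range(n_pairs):
        i = rng.randrange(d)
        j = rng.randrange(d - 1)
        if j >= i:
            j += 1  # makes j uniform over the d-1 indices different from i
        total += pair_cov(i, j)
    mean_cov = total / n_pairs
    return d * alpha * (1 - alpha) + d * (d - 1) * mean_cov


def block_cov(i, j, block_size=10, within=0.01):
    """Hypothetical dependence: tests in the same block share covariance."""
    return within if i // block_size == j // block_size else 0.0


d, alpha = 200, 0.05
independent = d * alpha * (1 - alpha)       # variance if tests were independent
exact = exact_variance(d, alpha, block_cov)
approx = sampled_variance(d, alpha, block_cov, 50_000, random.Random(0))
print(independent, exact, approx)
```

In this toy setting the independence formula understates the variance because every within-block pair contributes a positive covariance. The sampled estimate is unbiased for the exact value, and its accuracy depends on the number of sampled pairs rather than on d.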
Funding Information
- US National Science Foundation (DMS-0306612)
This publication has 12 references indexed in Scilit:
- Controlling the number of false discoveries: application to high-dimensional genomic data. Journal of Statistical Planning and Inference, 2004
- A stochastic process approach to false discovery control. The Annals of Statistics, 2004
- Diverse and Specific Gene Expression Responses to Stresses in Cultured Human Cells. Molecular Biology of the Cell, 2004
- Strong Control, Conservative Point Estimation and Simultaneous Conservative Consistency of False Discovery Rates: A Unified Approach. Journal of the Royal Statistical Society Series B: Statistical Methodology, 2003
- Multiple hypotheses testing and expected number of type I errors. The Annals of Statistics, 2002
- The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics, 2001
- Graphical Models. Published by Oxford University Press (OUP), 1996
- Quadpack. Published by Springer Nature, 1983
- Plots of P-values to evaluate many tests simultaneously. Biometrika, 1982
- Theoretical Statistics. Published by Springer Nature, 1974