False discovery rate, sensitivity and sample size for microarray studies
Open Access
- 19 April 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (13) , 3017-3024
- https://doi.org/10.1093/bioinformatics/bti448
Abstract
Motivation: In microarray data studies most researchers are keenly aware of the potentially high rate of false positives and the need to control it. One key statistical shift is the move away from the well-known P-value to false discovery rate (FDR). Less discussion perhaps has been spent on the sensitivity or the associated false negative rate (FNR). The purpose of this paper is to explain in simple ways why the shift from P-value to FDR for statistical assessment of microarray data is necessary, to elucidate the determining factors of FDR and, for a two-sample comparative study, to discuss its control via sample size at the design stage.
Results: We use a mixture model, involving differentially expressed (DE) and non-DE genes, that captures the most common problem of finding DE genes. Factors determining FDR are (1) the proportion of truly differentially expressed genes, (2) the distribution of the true differences, (3) measurement variability and (4) sample size. Many current small microarray studies are plagued with large FDR, but controlling FDR alone can lead to unacceptably large FNR. In evaluating a design of a microarray study, sensitivity or FNR curves should be computed routinely together with FDR curves. Under certain assumptions, the FDR and FNR curves coincide, thus simplifying the choice of sample size for controlling the FDR and FNR jointly.
Availability: R-package OCplus for computing FDR, sensitivity curves and sample size is freely available at http://www.meb.ki.se/~yudpaw
Contact: yudi.pawitan@meb.ki.se
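The way the four factors in the Results drive FDR can be illustrated with a small calculation. The following is a minimal sketch under simplifying assumptions that are not taken from the paper: every DE gene shares one standardized effect size (true difference divided by measurement standard deviation), the two-sample statistic is approximated by a normal z-score with equal group sizes, and FNR is taken as one minus sensitivity. The function name `fdr_and_sensitivity` and its parameters are hypothetical, and the sketch does not reproduce the OCplus package.

```python
from math import sqrt
from statistics import NormalDist


def fdr_and_sensitivity(p0, effect, n, crit):
    """Return (FDR, sensitivity) under a two-component mixture model.

    Assumptions (illustrative only, not the paper's exact method):
      p0     - proportion of non-DE genes (1 - p0 is the DE proportion)
      effect - common standardized difference for every DE gene
      n      - number of arrays per group in a two-sample comparison
      crit   - a gene is declared DE when |z| > crit
    """
    z = NormalDist()  # standard normal approximation to the test statistic

    # Probability that a non-DE gene exceeds the threshold (two-sided).
    alpha = 2.0 * (1.0 - z.cdf(crit))

    # Mean of the z-statistic for a DE gene with n arrays per group.
    shift = effect * sqrt(n / 2.0)

    # Probability that a DE gene exceeds the threshold (sensitivity).
    sensitivity = (1.0 - z.cdf(crit - shift)) + z.cdf(-crit - shift)

    # Expected share of false positives among all declared genes.
    false_pos = p0 * alpha
    true_pos = (1.0 - p0) * sensitivity
    fdr = false_pos / (false_pos + true_pos)
    return fdr, sensitivity


if __name__ == "__main__":
    # Tabulate FDR and FNR (taken here as 1 - sensitivity) against sample size.
    for n in (5, 10, 20, 40):
        fdr, sens = fdr_and_sensitivity(p0=0.95, effect=1.0, n=n, crit=3.0)
        print(f"n={n:2d}  FDR={fdr:.3f}  FNR={1.0 - sens:.3f}")
```

Varying `p0`, `effect` and `n` in turn shows each factor's influence. Under these assumptions, a larger sample size raises sensitivity at a fixed threshold, which lowers both FDR and FNR.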