Ranks of Genuine Associations in Whole-Genome Scans
Open Access
- 1 October 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Genetics
- Vol. 171 (2) , 813-823
- https://doi.org/10.1534/genetics.105.044206
Abstract
With recent advances in high-throughput genotyping techniques, it is now possible to perform whole-genome association studies to fine map causal polymorphisms underlying important traits that influence susceptibility to human diseases and efficacy of drugs. Once a genome scan is completed, the results can be sorted by the association statistic value. What is the probability that true positives will be encountered among the most strongly associated markers? When a particular polymorphism is found associated with the trait, it represents either a “true” or a “false” association (TA vs. FA). Setting appropriate significance thresholds has been considered a way to provide sufficient odds that the associations found to be significant are genuine. However, the problem with genome scans involving thousands of markers is that the statistic values of FAs can reach quite extreme magnitudes. In such situations, the distributions corresponding to TAs and the most extreme FAs become comparable, and significance thresholds tend to penalize TAs and FAs in a similar fashion. When sorting between true and false associations, the “typical” place (i.e., rank) of TAs among the most significant outcomes, ordered by the association statistic value, becomes important. The distribution of ranks that we study here allows calculation of several useful quantities. In particular, it gives the number of most significant markers that a follow-up study needs in order to include a true association with a specified probability. This can be calculated conditionally on having applied a multiple-testing correction. Effects of multilocus (e.g., haplotype association) tests and the impact of linkage disequilibrium on the distribution of ranks associated with TAs are evaluated and can be taken into account.
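The idea of a rank distribution for a true association can be illustrated with a small Monte Carlo sketch. This is not the authors' analytic method; it is a toy simulation under assumptions that are not taken from the abstract: a single TA, independent markers (no linkage disequilibrium), two-sided normal test statistics, and a hypothetical mean shift `effect` for the TA. The function names `simulate_ta_ranks` and `markers_needed` are illustrative only.

```python
import math
import random


def simulate_ta_ranks(n_markers, effect, n_reps, seed=0):
    """Simulate the rank of one true association (TA) among n_markers tests.

    Assumptions (hypothetical, for illustration): markers are independent,
    false associations (FAs) have N(0, 1) statistics, the TA has an
    N(effect, 1) statistic, and markers are ordered by absolute value.
    """
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_reps):
        ta_stat = abs(rng.gauss(effect, 1.0))
        # Count the FAs whose statistic is more extreme than the TA's.
        n_larger = sum(
            1
            for _ in range(n_markers - 1)
            if abs(rng.gauss(0.0, 1.0)) > ta_stat
        )
        ranks.append(1 + n_larger)  # rank 1 = most significant marker
    return ranks


def markers_needed(ranks, prob):
    """Smallest k such that the empirical P(rank <= k) is at least prob."""
    ordered = sorted(ranks)
    idx = math.ceil(prob * len(ordered)) - 1
    return ordered[max(idx, 0)]


if __name__ == "__main__":
    ranks = simulate_ta_ranks(n_markers=500, effect=3.0, n_reps=2000, seed=1)
    for prob in (0.5, 0.8, 0.95):
        k = markers_needed(ranks, prob)
        print(f"top {k} markers include the TA with probability >= {prob}")
```

In this toy setting, `markers_needed` reads off an empirical quantile of the simulated ranks, which plays the role of the number of top markers to carry into a follow-up study. Correlated markers or multilocus tests would change the rank distribution and are not modeled here.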