Population stratification and patterns of linkage disequilibrium
- 1 January 2009
- journal article
- research article
- Published by Wiley in Genetic Epidemiology
- Vol. 33 (S1) , S88-S92
- https://doi.org/10.1002/gepi.20478
Abstract
Although the importance of selecting cases and controls from the same population has been recognized for decades, the recent advent of genome‐wide association studies has heightened awareness of this issue. Because these studies typically deal with large samples, small differences in allele frequencies between cases and controls can easily reach statistical significance. When, unbeknownst to a researcher, cases and controls have different substructures, the number of false‐positive findings is inflated. Three purely statistical approaches to assessing the ancestral comparability of case and control samples have recently been developed: genomic control, structured association, and multivariate reduction analyses. The widespread use of high‐throughput technology has allowed the quick and accurate genotyping of the large number of markers required by these methods. Group 13 dealt with four population stratification issues: single‐nucleotide polymorphism marker selection, association testing, nonstandard methods, and linkage disequilibrium calculations in stratified or mixed ethnicity samples. We demonstrated that there are continuous axes of ethnic variation in both data sets of Genetic Analysis Workshop 16. Furthermore, ignoring this structure created P‐value inflation for a variety of phenotypes. Principal components (or multidimensional scaling coordinates), included as covariates in a logistic regression, can control this inflation. One can weight for local ancestry estimation and allow the use of related individuals. Problems arise in the presence of extremely high association or unusually strong linkage disequilibrium (e.g., in chromosomal inversions). Our group also reported a method for performing an association test controlling for substructure, when genome‐wide markers are not available, to explicitly compute stratification. Genet. Epidemiol. 33 (Suppl. 1):S88–S92, 2009.
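As an illustration of the first approach named in the abstract, genomic control, the following is a minimal sketch and not the workshop group's own code. It assumes one 1-degree-of-freedom chi-square association statistic per marker and estimates the inflation factor as the observed median statistic divided by the null median. The function names, the simulated statistics, and the 1.3 inflation multiplier are all hypothetical choices made for the example.

```python
import random
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation_factor(chi2_stats):
    """Return lambda: observed median statistic over the null median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control_adjust(chi2_stats):
    """Divide each statistic by lambda, floored at 1 so nothing is inflated."""
    lam = max(1.0, genomic_inflation_factor(chi2_stats))
    return [s / lam for s in chi2_stats]


# Simulated data (hypothetical): under the null, a 1-df chi-square
# statistic is the square of a standard normal draw.
rng = random.Random(0)
null_stats = [rng.gauss(0, 1) ** 2 for _ in range(20000)]

# Mimic stratification by scaling every statistic by an assumed factor.
inflated_stats = [1.3 * s for s in null_stats]

lambda_null = genomic_inflation_factor(null_stats)
lambda_inflated = genomic_inflation_factor(inflated_stats)
lambda_adjusted = genomic_inflation_factor(genomic_control_adjust(inflated_stats))
```

In this sketch `lambda_null` should sit close to 1, `lambda_inflated` close to 1.3, and `lambda_adjusted` back at 1 after rescaling. A uniform rescaling of this kind cannot correct markers whose allele frequencies differ unusually strongly across subpopulations, which is one reason covariate-based adjustment with principal components is often preferred.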