A simple and fast two‐locus quality control test to detect false positives due to batch effects in genome‐wide association studies
Open Access
- 22 November 2010
- journal article
- research article
- Published by Wiley in Genetic Epidemiology
- Vol. 34 (8) , 854-862
- https://doi.org/10.1002/gepi.20541
Abstract
The impact of erroneous genotypes having passed standard quality control (QC) can be severe in genome‐wide association studies, genotype imputation, and estimation of heritability and prediction of genetic risk based on single nucleotide polymorphisms (SNP). To detect such genotyping errors, a simple two‐locus QC method, based on the difference in test statistic of association between single SNPs and pairs of SNPs, was developed and applied. The proposed approach could detect many problematic SNPs with statistical significance even when standard single SNP QC analyses fail to detect them in real data. Depending on the data set used, the number of erroneous SNPs that were not filtered out by standard single SNP QC but detected by the proposed approach varied from a few hundred to thousands. Using simulated data, it was shown that the proposed method was powerful and performed better than other tested existing methods. The power of the proposed approach to detect erroneous genotypes was ∼80% for a 3% error rate per SNP. This novel QC approach is easy to implement and computationally efficient, and can lead to a better quality of genotypes for subsequent genotype‐phenotype investigations. Genet. Epidemiol. 34:854–862, 2010.
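The abstract does not give the exact form of the two‐locus statistic, so the following is only a minimal sketch of the general idea: compare the case-control association statistic of a SNP pair with the statistics of the two SNPs taken singly. Everything here is an assumption made for illustration, not the authors' implementation: the use of Pearson chi-square tests on genotype counts, the function names (`chi_square`, `single_stat`, `pair_stat`, `two_locus_excess`), and the choice to report the pair statistic minus the larger single-SNP statistic.

```python
from collections import Counter


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    # Drop empty columns so that expected counts are never zero.
    cols = [c for c in zip(*table) if sum(c) > 0]
    rows = [list(r) for r in zip(*cols)]
    row_tot = [sum(r) for r in rows]
    col_tot = [sum(c) for c in cols]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n
            if exp > 0:
                stat += (obs - exp) ** 2 / exp
    df = (len(rows) - 1) * (len(cols) - 1)
    return stat, df


def table_from(labels, status):
    """Build a 2 x k table of label counts in controls (0) and cases (1)."""
    counts = Counter(zip(status, labels))
    categories = sorted(set(labels))
    return [[counts[(s, c)] for c in categories] for s in (0, 1)]


def single_stat(genotypes, status):
    """Genotypic association test for one SNP (genotypes coded 0/1/2)."""
    return chi_square(table_from(genotypes, status))


def pair_stat(g1, g2, status):
    """Association test on the joint genotypes of two SNPs."""
    return chi_square(table_from(list(zip(g1, g2)), status))


def two_locus_excess(g1, g2, status):
    """Assumed summary: pair statistic minus the larger single-SNP statistic."""
    pair, _ = pair_stat(g1, g2, status)
    s1, _ = single_stat(g1, status)
    s2, _ = single_stat(g2, status)
    return pair - max(s1, s2)


# Toy data (hypothetical): each SNP alone looks identical in cases and
# controls, but the joint genotypes differ completely between the groups.
status = [1] * 50 + [0] * 50
snp_a = [0] * 25 + [2] * 25 + [0] * 25 + [2] * 25
snp_b = [0] * 25 + [2] * 25 + [2] * 25 + [0] * 25

print(single_stat(snp_a, status))
print(pair_stat(snp_a, snp_b, status))
print(two_locus_excess(snp_a, snp_b, status))
```

In this toy example the single-SNP tests show no signal while the pair test does, which is the kind of discrepancy a two‐locus check is meant to surface. A real analysis would also need a significance threshold and a rule for choosing SNP pairs, neither of which is specified in the abstract.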