A Method to Address Differential Bias in Genotyping in Large-Scale Association Studies
Open Access
- 18 May 2007
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Genetics
- Vol. 3 (5) , e74
- https://doi.org/10.1371/journal.pgen.0030074
Abstract
In a previous paper we showed that, when DNA samples for cases and controls are prepared in different laboratories prior to high-throughput genotyping, scoring inaccuracies can lead to differential misclassification and, consequently, to increased false-positive rates. Different DNA sourcing is often unavoidable in large-scale disease association studies of multiple case and control sets. Here, we describe methodological improvements to minimise such biases. These fall into two categories: improvements to the basic clustering methods for identifying genotypes from fluorescence intensities, and use of “fuzzy” calls in association tests in order to make appropriate allowance for call uncertainty. We find that the main improvement is a modification of the calling algorithm that links the clustering of cases and controls while allowing for different DNA sourcing. We also find that, in the presence of different DNA sourcing, biases associated with missing data can increase the false-positive rate. Therefore, we propose the use of “fuzzy” calls to deal with uncertain genotypes that would otherwise be labeled as missing.

Genome-wide disease association studies are becoming more common and involve genotyping cases and controls at a large number of SNP markers spread throughout the genome. We have shown previously that such studies can have an inflated false-positive rate as a result of genotype-calling inaccuracies when DNA samples for cases and controls are prepared in different laboratories prior to genotyping.
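The idea of carrying call uncertainty into the association test can be illustrated with a small sketch. The abstract does not give the form of the test, so the code below is only one common way to do it and is not necessarily the paper's formulation: each sample's hard call is replaced by its expected allele count under the posterior call probabilities, and a 1-degree-of-freedom score (trend) test is applied to those expected counts. The function names `expected_dosage` and `fuzzy_trend_test`, and the input layout of three posterior probabilities per sample, are assumptions made for illustration.

```python
from math import erfc, sqrt


def expected_dosage(probs):
    """Expected count of the B allele given posterior call
    probabilities (P(AA), P(AB), P(BB)) for one sample."""
    p_aa, p_ab, p_bb = probs
    return p_ab + 2.0 * p_bb


def fuzzy_trend_test(posteriors, is_case):
    """Score (trend) test of case status against expected dosage.

    posteriors: list of (P(AA), P(AB), P(BB)) tuples, one per sample.
    is_case:    list of booleans, True for cases, False for controls.
    Returns (chi-square statistic on 1 df, p-value).
    """
    x = [expected_dosage(p) for p in posteriors]
    y = [1.0 if c else 0.0 for c in is_case]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Score statistic and its null variance for a logistic model.
    u = sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y))
    v = y_bar * (1.0 - y_bar) * sum((xi - x_bar) ** 2 for xi in x)
    if v == 0.0:
        return 0.0, 1.0  # no variation in dosage or in case status

    chi2 = u * u / v
    # Upper tail of a chi-square with 1 df: P(X > c) = erfc(sqrt(c / 2)).
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Usage: an uncertain sample contributes a fractional dosage
# instead of being dropped as a missing call.
posteriors = [(0.9, 0.1, 0.0), (0.1, 0.3, 0.6), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
is_case = [True, True, False, False]
chi2, p_value = fuzzy_trend_test(posteriors, is_case)
```

In this sketch a sample with an ambiguous call still enters the test, weighted by how uncertain it is, which is the motivation the abstract gives for using fuzzy calls in place of missing genotypes.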