Intra- and interpopulation genotype reconstruction from tagging SNPs
- 6 December 2006
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 17 (1), 96-107
- https://doi.org/10.1101/gr.5741407
Abstract
The optimal method for tagging SNP (tSNP) selection, the applicability of a reference linkage disequilibrium (LD) map to unassayed populations, and the scalability of these methods to genome-wide analysis all remain subjects of debate. We propose novel, scalable matrix algorithms that address these issues, and we evaluate them on genotypic data from 38 populations and four genomic regions (248 SNPs typed for ∼2000 individuals). We also evaluate these algorithms on a second data set consisting of genotypes available from the HapMap database (1336 SNPs for four populations) over the same genomic regions. Furthermore, we test these methods in the setting of a real association study using a publicly available family data set. The algorithms we use for tSNP selection and unassayed SNP reconstruction do not require haplotype inference and are, in principle, scalable even to genome-wide analysis. Moreover, they are greedy variants of recently developed matrix algorithms with provable performance guarantees. Using a small set of carefully selected tSNPs, we achieve very good reconstruction accuracy of “untyped” genotypes for most of the populations studied. Additionally, we demonstrate quantitatively that the chosen tSNPs exhibit substantial transferability, both within and across different geographic regions. Finally, we show that reconstruction can be applied to retrieve significant SNP associations with disease, with important genotyping savings.
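The abstract does not spell out the algorithms, so the following is only an illustrative sketch of the general idea (greedy column selection on a genotype matrix, followed by linear reconstruction of untyped SNPs) and not the authors' published method. It assumes genotypes are coded as minor-allele counts 0/1/2, with rows as individuals and columns as SNPs. The function names `greedy_select`, `fit_coefficients`, and `reconstruct` and the toy matrix `train` are hypothetical.

```python
# Illustrative sketch only: NOT the algorithm from the paper.
# Assumption: genotypes are coded 0/1/2; rows = individuals, columns = SNPs.

def greedy_select(A, k):
    """Greedily pick up to k column indices (candidate tSNPs).

    Each step takes the column with the largest residual norm, then removes
    its direction from every column (Gram-Schmidt style deflation).
    """
    n, m = len(A), len(A[0])
    R = [[float(A[i][j]) for i in range(n)] for j in range(m)]  # residual columns
    chosen = []
    for _ in range(k):
        norms = [sum(x * x for x in col) for col in R]
        j = max(range(m), key=lambda c: norms[c])
        if norms[j] < 1e-12:
            break  # every remaining column is already explained
        chosen.append(j)
        q = [x / norms[j] ** 0.5 for x in R[j]]
        for c in range(m):
            dot = sum(a * b for a, b in zip(q, R[c]))
            R[c] = [a - dot * b for a, b in zip(R[c], q)]
    return chosen


def fit_coefficients(A, tags):
    """Least-squares X with A ~ C X, where C holds the tag columns of A.

    Solves the normal equations (C^T C) X = C^T A by Gauss-Jordan elimination;
    assumes the tag columns are linearly independent.
    """
    n, m, k = len(A), len(A[0]), len(tags)
    M = []
    for a in range(k):
        gram = [sum(A[i][tags[a]] * A[i][tags[b]] for i in range(n)) for b in range(k)]
        rhs = [sum(A[i][tags[a]] * A[i][j] for i in range(n)) for j in range(m)]
        M.append([float(v) for v in gram + rhs])
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(k):
            if r != col:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    return [row[k:] for row in M]  # k x m coefficient matrix


def reconstruct(tag_values, X):
    """Estimate all SNPs of one individual from its tSNP genotypes only."""
    m = len(X[0])
    est = [sum(t * X[a][j] for a, t in enumerate(tag_values)) for j in range(m)]
    return [min(2, max(0, round(v))) for v in est]  # snap to valid 0/1/2 calls


# Toy reference panel: SNP 2 duplicates SNP 0 and SNP 3 duplicates SNP 1
# (perfect LD), so two tSNPs should be enough.
train = [
    [0, 2, 0, 2],
    [1, 0, 1, 0],
    [2, 1, 2, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
tags = greedy_select(train, 2)
X = fit_coefficients(train, tags)
print(tags, reconstruct([2, 0], X))
```

In this toy setting the coefficients are learned on a reference panel and then applied to an individual typed only at the selected tSNPs, which mirrors the intra- versus interpopulation transfer question the abstract raises. Real data would need handling of missing genotypes and a principled choice of the number of tSNPs.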