Haplotype Block Partitioning and Tag SNP Selection Using Genotype Data and Their Applications to Association Studies
Open Access
- 12 April 2004
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 14 (5) , 908-916
- https://doi.org/10.1101/gr.1837404
Abstract
Recent studies have revealed that linkage disequilibrium (LD) patterns vary across the human genome with some regions of high LD interspersed by regions of low LD. A small fraction of SNPs (tag SNPs) is sufficient to capture most of the haplotype structure of the human genome. In this paper, we develop a method to partition haplotypes into blocks and to identify tag SNPs based on genotype data by combining a dynamic programming algorithm for haplotype block partitioning and tag SNP selection based on haplotype data with a variation of the expectation maximization (EM) algorithm for haplotype inference. We assess the effects of using either haplotype or genotype data in haplotype block identification and tag SNP selection as a function of several factors, including sample size, density or number of SNPs studied, allele frequencies, fraction of missing data, and genotyping error rate, using extensive simulations. We find that a modest number of haplotype or genotype samples will result in consistent block partitions and tag SNP selection. The power of association studies based on tag SNPs using genotype data is similar to that using haplotype data.
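To illustrate the haplotype-inference step the abstract mentions, here is a minimal sketch of the textbook EM algorithm for estimating haplotype frequencies at two biallelic SNPs from unphased genotypes. It is not the paper's variation of EM and does not include block partitioning or tag SNP selection. The function name `em_two_snp_haplotype_freqs`, the 0/1/2 genotype coding (copies of the minor allele), and the uniform starting frequencies are assumptions made for illustration only.

```python
from collections import Counter

# Alleles carried on the two chromosomes for a genotype coded 0, 1 or 2
# (assumed coding: number of copies of the minor allele).
ALLELES = {0: (0, 0), 1: (0, 1), 2: (1, 1)}


def em_two_snp_haplotype_freqs(genotypes, n_iter=50):
    """Estimate frequencies of haplotypes (0,0), (0,1), (1,0), (1,1).

    genotypes: list of (g1, g2) pairs, one pair per individual.
    Only the double heterozygote (1, 1) has ambiguous phase.
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    freq = {h: 0.25 for h in haps}  # assumed uniform starting point
    n_chrom = 2 * len(genotypes)

    for _ in range(n_iter):
        counts = Counter({h: 0.0 for h in haps})
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: split the individual between the two phasings,
                # 00/11 ("cis") and 01/10 ("trans"), in proportion to
                # their probabilities under the current frequencies.
                cis = freq[(0, 0)] * freq[(1, 1)]
                trans = freq[(0, 1)] * freq[(1, 0)]
                total = cis + trans
                w = cis / total if total > 0 else 0.5
                counts[(0, 0)] += w
                counts[(1, 1)] += w
                counts[(0, 1)] += 1 - w
                counts[(1, 0)] += 1 - w
            else:
                # Phase is fully determined when at most one SNP is
                # heterozygous: pair the alleles chromosome by chromosome.
                a, b = ALLELES[g1], ALLELES[g2]
                counts[(a[0], b[0])] += 1
                counts[(a[1], b[1])] += 1
        # M-step: expected haplotype counts divided by chromosome count.
        freq = {h: counts[h] / n_chrom for h in haps}
    return freq
```

For example, `em_two_snp_haplotype_freqs([(0, 0), (2, 2), (1, 1)])` returns a dictionary of four frequencies that sum to 1. With more than two SNPs the number of phasings consistent with a genotype grows quickly, which is one reason practical methods restrict inference to short segments such as blocks.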