Linkage disequilibrium assessment via log‐linear modeling of SNP haplotype frequencies

Abstract
Analyses of high‐density single‐nucleotide polymorphism (SNP) data, such as genetic mapping and linkage disequilibrium (LD) studies, require phase‐known haplotypes to allow for the correlation between tightly linked loci. However, current SNP genotyping technology cannot determine phase, which must be inferred statistically. In this paper, we present a new Bayesian Markov chain Monte Carlo (MCMC) algorithm for population haplotype frequency estimation, particularly in the context of LD assessment. The novel feature of the method is the incorporation of a log‐linear prior model for population haplotype frequencies. We present simulations suggesting that (1) the log‐linear prior model is more appropriate than the standard coalescent process in the presence of recombination (>0.02 cM between adjacent loci), and (2) measures of LD are substantially inflated when obtained by a “two‐stage” approach that treats the “best” haplotype configuration as correct, without regard to uncertainty in the recombination process. Genet Epidemiol 25:106–114, 2003.
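To make concrete how measures of LD depend on haplotype frequencies, the following minimal sketch computes the standard pairwise measures D, D′, and r² for two biallelic SNPs. It uses the textbook definitions only and is not the MCMC algorithm or the log‐linear prior described above; the function name `ld_measures` and its argument names are hypothetical, chosen for illustration.

```python
def ld_measures(p_AB, p_Ab, p_aB, p_ab):
    """Return (D, D', r^2) from the four haplotype frequencies at two
    biallelic SNPs. Standard textbook definitions; the function and
    argument names are illustrative assumptions, not from the paper."""
    total = p_AB + p_Ab + p_aB + p_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")

    # Marginal allele frequencies at each locus.
    p_A = p_AB + p_Ab
    p_B = p_AB + p_aB

    # D: deviation of the AB haplotype frequency from independence.
    d = p_AB - p_A * p_B

    # D' rescales D by its maximum attainable magnitude given the
    # allele frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between alleles at the two loci.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0

    return d, d_prime, r2


if __name__ == "__main__":
    # Hypothetical frequencies, for illustration only.
    print(ld_measures(0.4, 0.1, 0.1, 0.4))
```

Because each measure is a deterministic function of the haplotype frequencies, fixing those frequencies at a single “best” phase reconstruction carries no information about phase uncertainty into the resulting LD values, which is the concern raised about the two‐stage approach.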