A MARKOV CHAIN APPROACH TO RECONSTRUCTION OF LONG HAPLOTYPES

Abstract
Haplotypes are important for association-based gene mapping, but there are no practical laboratory methods for obtaining them directly from DNA samples. We propose simple Markov models for reconstructing haplotypes from a given sample of multilocus genotypes. The models are aimed specifically at long marker maps, where linkage disequilibrium (LD) between markers may vary and be relatively weak. Such maps are ultimately used in chromosome-wide or genome-wide association studies. Haplotype reconstruction with standard Markov chains is based on LD between neighboring markers. Markov chains of higher order can capture LD in a neighborhood of a given size. We introduce a more flexible and robust model, MC-VL, which is based on a Markov chain of variable order. Experimental validation of the Markov chain methods on a wide range of simulated data and on real data shows that they clearly outperform previous methods on genetically long marker maps and are highly competitive on short maps, too. MC-VL performs well across different data sets and settings while avoiding the problem of manually choosing an appropriate order for the Markov chain, and it has low computational complexity.
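
To illustrate how a standard (first-order) Markov chain uses LD between neighboring markers, the following is a minimal sketch, not the method of the paper. It assumes biallelic markers coded 0/1, genotypes coded as the count of allele 1 per marker (0, 1 or 2), and, for simplicity, a training set of already known haplotypes. The function names (`fit_first_order`, `log_prob`, `phase`) and the pseudocount are hypothetical choices made for this example. The sketch enumerates all haplotype pairs compatible with a genotype, which is exponential in the number of heterozygous markers and therefore only suitable for very short maps; it does not implement higher-order chains or MC-VL.

```python
from itertools import product
from math import log


def fit_first_order(haplotypes, pseudo=1.0):
    """Estimate a position-specific first-order Markov chain from 0/1 haplotypes.

    Returns (init, trans) where init[a] = P(allele a at marker 0) and
    trans[i][a][b] = P(allele b at marker i+1 | allele a at marker i).
    The pseudocount (an assumption of this sketch) avoids zero probabilities.
    """
    n = len(haplotypes[0])
    init = [pseudo, pseudo]
    trans = [[[pseudo, pseudo], [pseudo, pseudo]] for _ in range(n - 1)]
    for h in haplotypes:
        init[h[0]] += 1
        for i in range(n - 1):
            trans[i][h[i]][h[i + 1]] += 1
    total = sum(init)
    init = [c / total for c in init]
    trans = [[[c / sum(row) for c in row] for row in t] for t in trans]
    return init, trans


def log_prob(h, init, trans):
    """Log-probability of one haplotype under the fitted chain."""
    lp = log(init[h[0]])
    for i in range(len(h) - 1):
        lp += log(trans[i][h[i]][h[i + 1]])
    return lp


def phase(genotype, init, trans):
    """Return the most probable haplotype pair compatible with a genotype.

    Exhaustive search over phasings of the heterozygous markers; the first
    heterozygous marker is fixed to remove the mirror-image duplicate.
    """
    het = [i for i, g in enumerate(genotype) if g == 1]
    base = [g // 2 for g in genotype]  # homozygous sites: 0 -> 0, 2 -> 1
    best, best_lp = None, float("-inf")
    for bits in product((0, 1), repeat=max(len(het) - 1, 0)):
        assign = (0,) + bits if het else ()
        h1, h2 = list(base), list(base)
        for pos, b in zip(het, assign):
            h1[pos], h2[pos] = b, 1 - b
        lp = log_prob(h1, init, trans) + log_prob(h2, init, trans)
        if lp > best_lp:
            best, best_lp = (tuple(h1), tuple(h2)), lp
    return best


# Toy training data with strong LD: only two haplotypes are ever observed.
training = [(0, 0, 0)] * 5 + [(1, 1, 1)] * 5
init, trans = fit_first_order(training)
result = phase((1, 1, 1), init, trans)
```

With this toy training set, a genotype that is heterozygous at all three markers is resolved into the two haplotypes seen in training, because transitions that keep the same allele between neighboring markers are far more probable than switches. A chain of higher order would condition each marker on several preceding markers instead of one.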
