HAPLORE: a program for haplotype reconstruction in general pedigrees without recombination
Open Access
- 1 July 2004
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (1) , 90-103
- https://doi.org/10.1093/bioinformatics/bth388
Abstract
Motivation: Haplotype reconstruction is an essential step in genetic linkage and association studies. Although many methods have been developed to estimate haplotype frequencies and reconstruct haplotypes for a sample of unrelated individuals, haplotype reconstruction in large pedigrees with a large number of genetic markers remains a challenging problem.
Methods: We have developed an efficient computer program, HAPLORE (HAPLOtype REconstruction), to identify all haplotype sets that are compatible with the observed genotypes in a pedigree for tightly linked genetic markers. HAPLORE consists of three steps that can serve different needs in applications. In the first step, a set of logic rules is used to reduce the number of compatible haplotypes of each individual in the pedigree as much as possible. After this step, the haplotypes of all individuals in the pedigree can be completely or partially determined. These logic rules are applicable to completely linked markers, and they can be used to impute missing data and check for genotyping errors. In the second step, a haplotype-elimination algorithm similar to the genotype-elimination algorithms used in linkage analysis is applied to delete incompatible haplotypes derived from the first step. All superfluous haplotypes of the pedigree members are excluded after this step. In the third step, the expectation-maximization (EM) algorithm combined with the partition and ligation technique is used to estimate haplotype frequencies based on the haplotype configurations inferred in the first two steps. Only compatible haplotype configurations with haplotypes having frequencies greater than a threshold are retained.
Results: We test the effectiveness and the efficiency of HAPLORE using both simulated and real datasets. Our results show that the rule-based algorithm is very efficient for completely genotyped pedigrees. In this case, almost all of the families have one unique haplotype configuration. In the presence of missing data, the number of compatible haplotypes can be substantially reduced by HAPLORE, and the program will provide all possible haplotype configurations of a pedigree under different circumstances, if such multiple configurations exist. These inferred haplotype configurations, as well as the haplotype frequencies estimated by the EM algorithm, can be used in genetic linkage and association studies.
Availability: The program can be downloaded from http://bioinformatics.med.yale.edu
Contact: hongyu.zhao@yale.edu
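The idea behind the first two steps can be illustrated with a minimal sketch for a single father-mother-child trio under the zero-recombination assumption. This is not HAPLORE's code: the function names `compatible_pairs` and `eliminate_trio`, the biallelic 1/2 allele coding, and the toy genotypes are all assumptions made for illustration. The sketch enumerates every haplotype pair compatible with each unphased genotype, then discards pairs that cannot take part in any transmission of whole, unrecombined haplotypes.

```python
from itertools import product

ALLELES = (1, 2)  # assumption: biallelic markers coded 1 and 2


def compatible_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with an unphased genotype.

    genotype: one entry per marker, either an allele pair such as (1, 2)
    or None for a missing genotype (any alleles allowed).
    """
    per_marker = []
    for g in genotype:
        if g is None:
            options = [(a, b) for a in ALLELES for b in ALLELES]
        else:
            a, b = g
            # a heterozygous marker can be phased two ways
            options = [(a, b)] if a == b else [(a, b), (b, a)]
        per_marker.append(options)
    pairs = set()
    for combo in product(*per_marker):
        h1 = tuple(x[0] for x in combo)
        h2 = tuple(x[1] for x in combo)
        pairs.add(tuple(sorted((h1, h2))))  # sort so the pair is unordered
    return pairs


def eliminate_trio(father, mother, child):
    """Keep only pairs that appear in at least one consistent transmission.

    Without recombination the child carries one whole haplotype from the
    father and one whole haplotype from the mother.
    """
    keep_f, keep_m, keep_c = set(), set(), set()
    for f, m, c in product(father, mother, child):
        if (c[0] in f and c[1] in m) or (c[1] in f and c[0] in m):
            keep_f.add(f)
            keep_m.add(m)
            keep_c.add(c)
    return keep_f, keep_m, keep_c


# Toy trio with two markers (hypothetical data).
father = compatible_pairs([(1, 2), (1, 2)])
mother = compatible_pairs([(1, 1), (1, 1)])
child = compatible_pairs([(1, 2), (1, 2)])
f, m, c = eliminate_trio(father, mother, child)
print("father:", len(father), "->", len(f))
print("child: ", len(child), "->", len(c))
```

In this toy trio the doubly heterozygous father and child each start with two candidate pairs, and the homozygous mother leaves only one consistent configuration. A general pedigree would need the same check applied repeatedly across all nuclear families until no further pairs are removed, which this single-trio sketch does not attempt.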