ON THE INFERENCE OF PARSIMONIOUS INDEL EVOLUTIONARY SCENARIOS
- 1 June 2006
- journal article
- research article
- Published by World Scientific Pub Co Pte Ltd in Journal of Bioinformatics and Computational Biology
- Vol. 4 (3) , 721-744
- https://doi.org/10.1142/s0219720006002168
Abstract
Given a multiple alignment of orthologous DNA sequences and a phylogenetic tree for these sequences, we investigate the problem of reconstructing a most parsimonious scenario of insertions and deletions capable of explaining the gaps observed in the alignment. This problem, called the Indel Parsimony Problem, is a crucial component of the problem of ancestral genome reconstruction, and its solution provides valuable information to many genome functional annotation approaches. We first show that the problem is NP-complete. Second, we provide an algorithm, based on the fractional relaxation of an integer linear programming formulation. The algorithm is fast in practice, and the solutions it produces are, in most cases, provably optimal. We describe a divide-and-conquer approach that makes it possible to solve very large instances on a simple desktop machine, while retaining guaranteed optimality. Our algorithms are tested and shown efficient and accurate on a set of 1.8 Mb mammalian orthologous sequences in the CFTR region.Keywords
This publication has 16 references indexed in Scilit:
- Reconstructing large regions of an ancestral mammalian genome in silicoGenome Research, 2004
- Reconstructing the Genomic Architecture of Ancestral Mammals: Lessons From Human, Mouse, and Rat GenomesGenome Research, 2004
- Regulatory Potential Scores From Genome-Wide Three-Way Alignments of Human, Mouse, and RatGenome Research, 2004
- MAVID: Constrained Ancestral Alignment of Multiple SequencesGenome Research, 2004
- Aligning Multiple Genomic Sequences With the Threaded Blockset AlignerGenome Research, 2004
- An Efficient Algorithm for Statistical Multiple Alignment on Arbitrary Phylogenetic TreesJournal of Computational Biology, 2003
- LAGAN and Multi-LAGAN: Efficient Tools for Large-Scale Multiple Alignment of Genomic DNAGenome Research, 2003
- Geometric Algorithms and Combinatorial OptimizationPublished by Springer Nature ,1993
- Evolutionary trees from DNA sequences: A maximum likelihood approachJournal of Molecular Evolution, 1981
- Toward Defining the Course of Evolution: Minimum Change for a Specific Tree TopologySystematic Zoology, 1971