MAVID: Constrained Ancestral Alignment of Multiple Sequences
Open Access
- 1 April 2004
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 14 (4) , 693-699
- https://doi.org/10.1101/gr.1960404
Abstract
We describe a new global multiple-alignment program capable of aligning a large number of genomic regions. Our progressive-alignment approach incorporates the following ideas: maximum-likelihood inference of ancestral sequences, automatic guide-tree construction, protein-based anchoring of ab-initio gene predictions, and constraints derived from a global homology map of the sequences. We have implemented these ideas in the MAVID program, which is able to accurately align multiple genomic regions up to megabases long. MAVID is able to effectively align divergent sequences, as well as incomplete unfinished sequences. We demonstrate the capabilities of the program on the benchmark CFTR region, which consists of 1.8 Mb of human sequence and 20 orthologous regions in marsupials, birds, fish, and mammals. Finally, we describe two large MAVID alignments, an alignment of all the available HIV genomes and a multiple alignment of the entire human, mouse, and rat genomes.
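The progressive, ancestor-based idea in the abstract can be illustrated with a toy sketch. This is not MAVID's implementation: the guide tree is supplied by hand as nested tuples, the pairwise step is plain Needleman-Wunsch with arbitrary illustrative scores, and the ancestral sequence is chosen by a simple rule standing in for maximum-likelihood inference. Anchoring and homology-map constraints are omitted entirely. The function names (`needleman_wunsch`, `infer_ancestor`, `ancestral_sequence`) are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global pairwise alignment; returns the two gapped strings.

    The scoring values are illustrative assumptions, not MAVID's parameters.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append('-'); i -= 1
        else:
            out_a.append('-'); out_b.append(b[j - 1]); j -= 1
    return ''.join(reversed(out_a)), ''.join(reversed(out_b))


def infer_ancestor(aligned_a, aligned_b):
    """Toy stand-in for maximum-likelihood ancestral inference.

    Keeps the first child's base in every column unless it is a gap,
    in which case the second child's base is used.
    """
    return ''.join(x if x != '-' else y for x, y in zip(aligned_a, aligned_b))


def ancestral_sequence(tree):
    """Walk a guide tree (nested 2-tuples of strings) from leaves to root,
    aligning the two child ancestors at each internal node."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    a, b = needleman_wunsch(ancestral_sequence(left), ancestral_sequence(right))
    return infer_ancestor(a, b)


# Usage: three short sequences on a hand-written guide tree.
guide_tree = (("ACGT", "AGT"), "ACGT")
root = ancestral_sequence(guide_tree)
```

Aligning inferred ancestors rather than full profiles keeps each internal-node step a pairwise problem, which is the property a progressive method relies on when sequences are long.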