Aligning Multiple Genomic Sequences With the Threaded Blockset Aligner
Open Access
- 1 April 2004
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 14 (4), 708-715
- https://doi.org/10.1101/gr.1933104
Abstract
We define a “threaded blockset,” which is a novel generalization of the classic notion of a multiple alignment. A new computer program called TBA (for “threaded blockset aligner”) builds a threaded blockset under the assumption that all matching segments occur in the same order and orientation in the given sequences; inversions and duplications are not addressed. TBA is designed to be appropriate for aligning many, but by no means all, megabase-sized regions of multiple mammalian genomes. The output of TBA can be projected onto any genome chosen as a reference, thus guaranteeing that different projections present consistent predictions of which genomic positions are orthologous. This capability is illustrated using a new visualization tool to view TBA-generated alignments of vertebrate Hox clusters from both the mammalian and fish perspectives. Experimental evaluation of alignment quality, using a program that simulates evolutionary change in genomic sequences, indicates that TBA is more accurate than earlier programs. To perform the dynamic-programming alignment step, TBA runs a stand-alone program called MULTIZ, which can be used to align highly rearranged or incompletely sequenced genomes. We describe our use of MULTIZ to produce the whole-genome multiple alignments at the Santa Cruz Genome Browser.
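The idea of projecting one set of alignment blocks onto different reference genomes can be illustrated with a small sketch. This is not TBA's data structure or algorithm: the `Row`, `Block`, `project`, and `aligned_pairs` names, the toy sequences, and the simplification that a projection just selects and orders the blocks containing the reference are all assumptions made for illustration. The point it shows is that, because every projection draws on the same blocks, the human-based and mouse-based views agree on which positions are aligned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    species: str
    start: int   # 0-based start in the ungapped sequence (hypothetical convention)
    text: str    # aligned text; '-' marks a gap


@dataclass(frozen=True)
class Block:
    rows: tuple

    def row_for(self, species):
        """Return this block's row for a species, or None if absent."""
        for row in self.rows:
            if row.species == species:
                return row
        return None


def project(blocks, reference):
    """Blocks containing the reference, ordered by reference start."""
    kept = [b for b in blocks if b.row_for(reference) is not None]
    return sorted(kept, key=lambda b: b.row_for(reference).start)


def aligned_pairs(blocks, a, b):
    """Set of (position in a, position in b) sharing an alignment column."""
    pairs = set()
    for block in blocks:
        ra, rb = block.row_for(a), block.row_for(b)
        if ra is None or rb is None:
            continue
        pa, pb = ra.start, rb.start
        for ca, cb in zip(ra.text, rb.text):
            if ca != "-" and cb != "-":
                pairs.add((pa, pb))
            pa += ca != "-"   # advance only past real bases
            pb += cb != "-"
    return pairs


# Toy blocks (invented data, same order and orientation in every sequence).
blocks = [
    Block((Row("human", 4, "ACGT"), Row("mouse", 2, "AC-T"))),
    Block((Row("human", 0, "GGCA"), Row("fugu", 0, "GG-A"))),
    Block((Row("mouse", 0, "TT"),)),
]

human_view = project(blocks, "human")
mouse_view = project(blocks, "mouse")
same = aligned_pairs(human_view, "human", "mouse") == aligned_pairs(
    mouse_view, "human", "mouse"
)
print(same)
```

In this toy model the mouse-only block appears in the mouse view but not in the human view, yet the human/mouse position pairs are identical in both, which is the consistency property the abstract describes.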