A word-oriented approach to alignment validation
Open Access
- 22 February 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (10) , 2230-2239
- https://doi.org/10.1093/bioinformatics/bti335
Abstract
Motivation: Multiple sequence alignment at the level of whole proteomes requires a high degree of automation, precluding the use of traditional validation methods such as manual curation. Since evolutionary models are too general to describe the history of each residue in a protein family, there is no single algorithm/model combination that can yield a biologically or evolutionarily optimal alignment. We propose a ‘shotgun’ strategy where many different algorithms are used to align the same family, and the best of these alignments is then chosen with a reliable objective function. We present WOOF, a novel ‘word-oriented’ objective function that relies on the identification and scoring of conserved amino acid patterns (words) between pairs of sequences.
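The ‘shotgun’ selection step described in the abstract can be illustrated with a minimal sketch. This is not the WOOF scoring scheme, whose details the abstract does not give. It is a toy stand-in that counts, for every pair of sequences, identical k-residue words that an alignment places in the same columns, then keeps the highest-scoring candidate. The names `toy_word_score`, `pick_best`, `aligner_A` and `aligner_B`, the word length `k=3`, and the example sequences are all assumptions made for illustration.

```python
from itertools import combinations


def residue_columns(row):
    """Return the alignment column of each residue in a gapped row."""
    return [col for col, ch in enumerate(row) if ch != "-"]


def placed_words(row, k):
    """Return the set of (word, columns) pairs for all k-residue words in a row."""
    cols = residue_columns(row)
    seq = row.replace("-", "")
    return {
        (seq[i:i + k], tuple(cols[i:i + k]))
        for i in range(len(seq) - k + 1)
    }


def toy_word_score(alignment, k=3):
    """Toy objective (NOT WOOF): count identical words that occupy the
    same columns, summed over all pairs of sequences in the alignment."""
    score = 0
    for row_a, row_b in combinations(alignment, 2):
        score += len(placed_words(row_a, k) & placed_words(row_b, k))
    return score


def pick_best(candidates, k=3):
    """Return the name of the candidate alignment with the highest toy score."""
    return max(candidates, key=lambda name: toy_word_score(candidates[name], k))


# Two hypothetical aligners producing different alignments of the same pair.
candidates = {
    "aligner_A": ["MKV-LAGH", "MKVQLAGH"],  # gap placed opposite the inserted Q
    "aligner_B": ["MKVLAGH-", "MKVQLAGH"],  # gap pushed to the end
}

best = pick_best(candidates)
print(best, {name: toy_word_score(aln) for name, aln in candidates.items()})
```

In this example the first candidate keeps the words MKV, LAG and AGH in matching columns, while the second preserves only MKV, so the first is selected. A real objective function would also need to handle similar (not only identical) words and weight them, which this sketch deliberately omits.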