Refining multiple sequence alignments with conserved core regions

Open Access

1 January 2006

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 34 (9) , 2598-2606
https://doi.org/10.1093/nar/gkl274

Abstract

Accurate multiple sequence alignments of proteins are important to several areas of computational biology: they provide an understanding of the phylogenetic history of domain families and support their identification and classification. This article presents a new algorithm, REFINER, that refines a multiple sequence alignment by iteratively realigning its individual sequences against the predetermined conserved core (block) model of a protein family. Realigning each sequence can correct misalignments between that sequence and the rest of the profile while preserving the family's overall block model. Large-scale benchmarking studies showed a noticeable improvement in alignment quality after refinement, as indicated by increased alignment scores and by the enhanced database-search sensitivity of sequence profiles derived from refined alignments compared with those derived from the original alignments. A standalone version of the program is available by FTP (ftp://ftp.ncbi.nih.gov/pub/REFINER) and will be incorporated into the next release of the Cn3D structure/alignment viewer.
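The leave-one-out idea described in the abstract can be illustrated with a minimal Python sketch. This is not the REFINER implementation: the function names (`block_profile`, `place_blocks`, `refine`), the count-based log-odds scoring with a uniform background, the pseudocount, and the rule that block lengths stay fixed are all assumptions made for illustration. Each sequence is removed in turn, a profile is built for every block from the remaining sequences, and the blocks are re-placed on the removed sequence in order and without overlap; a new placement is kept only if it scores higher.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids


def block_profile(seqs, starts, b, length, skip, pseudo=1.0):
    """Log-odds column scores for block b, built from all sequences except `skip`.

    Assumes a uniform background (1/20 per residue); this is an illustrative choice.
    """
    profile = []
    for col in range(length):
        counts = Counter(s[starts[i][b] + col]
                         for i, s in enumerate(seqs) if i != skip)
        total = sum(counts.values()) + pseudo * len(ALPHABET)
        profile.append({a: math.log((counts[a] + pseudo) / total * len(ALPHABET))
                        for a in ALPHABET})
    return profile


def block_score(seq, start, prof):
    """Score of placing one block at `start`; unknown residues score 0."""
    return sum(prof[c].get(seq[start + c], 0.0) for c in range(len(prof)))


def place_blocks(seq, profiles):
    """Best ordered, non-overlapping placement of all blocks on seq (DP).

    Assumes seq is at least as long as the blocks combined.
    """
    n, neg = len(seq), float("-inf")
    # best[k][p]: best score for the first k blocks placed inside seq[:p]
    best = [[0.0] * (n + 1)] + [[neg] * (n + 1) for _ in profiles]
    choice = [[None] * (n + 1) for _ in range(len(profiles) + 1)]
    for k, prof in enumerate(profiles, start=1):
        for p in range(1, n + 1):
            best[k][p], choice[k][p] = best[k][p - 1], choice[k][p - 1]
            start = p - len(prof)
            if start >= 0 and best[k - 1][start] > neg:
                score = best[k - 1][start] + block_score(seq, start, prof)
                if score > best[k][p]:
                    best[k][p], choice[k][p] = score, start
    # Trace back from the last block to the first.
    placed, p = [], n
    for k in range(len(profiles), 0, -1):
        p = choice[k][p]
        placed.append(p)
    return best[len(profiles)][n], placed[::-1]


def refine(seqs, starts, block_lengths, max_rounds=10):
    """Realign each sequence against the blocks of the others until stable."""
    starts = [list(s) for s in starts]
    for _ in range(max_rounds):
        changed = False
        for i, seq in enumerate(seqs):
            profiles = [block_profile(seqs, starts, b, length, skip=i)
                        for b, length in enumerate(block_lengths)]
            current = sum(block_score(seq, starts[i][b], prof)
                          for b, prof in enumerate(profiles))
            new_score, new_starts = place_blocks(seq, profiles)
            if new_score > current + 1e-9:  # accept strict improvements only
                starts[i], changed = new_starts, True
        if not changed:
            break
    return starts


# Toy family with one 3-residue block ("HCW"); the last sequence starts misplaced.
toy_seqs = ["GGHCWGG", "AHCWAAA", "TTTHCWT", "SHCWSSS"]
toy_starts = [[2], [1], [3], [3]]
refined = refine(toy_seqs, toy_starts, block_lengths=[3])
```

In this toy case the block of the fourth sequence moves from offset 3 to offset 1, while the other three placements are left untouched. Because the number and length of blocks never change, the family's block model is preserved and only each sequence's placement relative to it is adjusted.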
