Extracting protein alignment models from the sequence database
Open Access
- 1 May 1997
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 25 (9) , 1665-1677
- https://doi.org/10.1093/nar/25.9.1665
Abstract
Biologists often gain structural and functional insights into a protein sequence by constructing a multiple alignment model of the family. Here a program called Probe fully automates this process of model construction starting from a single sequence. Central to this program is a powerful new method to locate and align only those, often subtly, conserved patterns essential to the family as a whole. When applied to randomly chosen proteins, Probe found on average about four times as many relationships as a pairwise search and yielded many new discoveries. These include: an obscure subfamily of globins in the roundworm Caenorhabditis elegans; two new superfamilies of metallohydrolases; a lipoyl/biotin swinging arm domain in bacterial membrane fusion proteins; and a DH domain in the yeast Bud3 and Fus2 proteins. By identifying distant relationships and merging families into superfamilies in this way, this analysis further confirms the notion that proteins evolved from relatively few ancient sequences. Moreover, this method automatically generates models of these ancient conserved regions for rapid and sensitive screening of sequences.