Recurrent use of evolutionary importance for functional annotation of proteins based on local structural similarity
Open Access
- 1 June 2006
- journal article
- Published by Wiley in Protein Science
- Vol. 15 (6) , 1530-1536
- https://doi.org/10.1110/ps.062152706
Abstract
The annotation of protein function has not kept pace with the exponential growth of raw sequence and structure data. An emerging solution to this problem is to identify 3D motifs or templates in protein structures that are necessary and sufficient determinants of function. Here, we demonstrate the recurrent use of evolutionary trace information to construct such 3D templates for enzymes, search for them in other structures, and distinguish true from spurious matches. Serine protease templates built from evolutionarily important residues distinguish between proteases and other proteins nearly as well as the classic Ser‐His‐Asp catalytic triad. In 53 enzymes spanning 33 distinct functions, an automated pipeline identifies functionally related proteins with an average positive predictive power of 62%, including correct matches to proteins with the same function but with low sequence identity (the average identity for some templates is only 17%). Although these template building, searching, and match classification strategies are not yet optimized, their sequential implementation demonstrates a functional annotation pipeline which does not require experimental information, but only local molecular mimicry among a small number of evolutionarily important residues.
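The abstract does not say how templates are matched or scored, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a template is a short list of residue coordinates and compares it to a candidate site by the root-mean-square difference of their pairwise distances, a measure that ignores rotation and translation. The function names, the toy coordinates, and the match counts are all hypothetical. The second helper computes positive predictive power as true matches divided by all reported matches.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Return every inter-point distance for a list of (x, y, z) tuples."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def distance_rmsd(template, candidate):
    """RMS difference between corresponding pairwise distances.

    Assumes both lists hold the same residues in the same order
    (an illustrative simplification, not taken from the paper).
    """
    if len(template) != len(candidate):
        raise ValueError("template and candidate must have the same size")
    if len(template) < 2:
        raise ValueError("at least two points are needed")
    d_template = pairwise_distances(template)
    d_candidate = pairwise_distances(candidate)
    squared = sum((a - b) ** 2 for a, b in zip(d_template, d_candidate))
    return math.sqrt(squared / len(d_template))


def positive_predictive_value(true_positives, false_positives):
    """Fraction of reported matches that are true matches."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no matches reported")
    return true_positives / total


# Hypothetical three-residue template and a translated copy of it.
template = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
shifted = [(10.0, -5.0, 2.0), (13.0, -5.0, 2.0), (10.0, -1.0, 2.0)]
stretched = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (0.0, 4.0, 0.0)]

same_site = distance_rmsd(template, shifted)       # identical geometry
other_site = distance_rmsd(template, stretched)    # distorted geometry
ppv = positive_predictive_value(62, 38)            # made-up counts
```

A comparison based on distances alone cannot tell a site from its mirror image, and a real search would also need a way to choose which residues in the target correspond to the template, so this sketch covers only the scoring step.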