A gold standard set of mechanistically diverse enzyme superfamilies
Open Access
- 31 January 2006
- journal article
- research article
- Published by Springer Nature in Genome Biology
- Vol. 7 (1), R8
- https://doi.org/10.1186/gb-2006-7-1-r8
Abstract
Superfamily and family analyses provide an effective tool for the functional classification of proteins, but must be automated for use on large datasets. We describe a 'gold standard' set of enzyme superfamilies, clustered according to specific sequence, structure, and functional criteria, for use in the validation of family and superfamily clustering methods. The gold standard set represents four fold classes and differing clustering difficulties, and includes five superfamilies, 91 families, 4,887 sequences and 282 structures.
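As a rough illustration of what validating a clustering method against a gold standard set can look like, the sketch below scores predicted clusters against reference family labels using pairwise precision and recall. This is one common scoring choice and is an assumption here, not a procedure taken from the abstract; the function names, sequence identifiers, and family labels are all hypothetical toy values.

```python
from itertools import combinations


def same_cluster_pairs(assignment):
    """Return the set of unordered ID pairs that share a cluster label."""
    pairs = set()
    # Sorting the IDs gives each unordered pair one canonical form (a, b).
    for a, b in combinations(sorted(assignment), 2):
        if assignment[a] == assignment[b]:
            pairs.add((a, b))
    return pairs


def pairwise_precision_recall(gold, predicted):
    """Compare a predicted clustering with gold-standard labels.

    Both arguments map sequence ID -> cluster label. Only sequences
    present in both mappings are scored.
    """
    shared = set(gold) & set(predicted)
    gold_pairs = same_cluster_pairs({s: gold[s] for s in shared})
    pred_pairs = same_cluster_pairs({s: predicted[s] for s in shared})
    true_pos = len(gold_pairs & pred_pairs)
    # Convention assumed here: an empty denominator scores 1.0.
    precision = true_pos / len(pred_pairs) if pred_pairs else 1.0
    recall = true_pos / len(gold_pairs) if gold_pairs else 1.0
    return precision, recall


# Hypothetical toy data: five sequences in two gold-standard families.
gold = {"seq1": "famA", "seq2": "famA", "seq3": "famA",
        "seq4": "famB", "seq5": "famB"}
# Hypothetical output of a clustering method that misplaces seq3.
predicted = {"seq1": "c1", "seq2": "c1", "seq3": "c2",
             "seq4": "c2", "seq5": "c2"}

precision, recall = pairwise_precision_recall(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Pairwise scoring does not require matching predicted cluster names to family names, which is convenient when a method produces an arbitrary number of unlabeled clusters.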