A database of protein structure families with common folding motifs
- 31 December 1992
- journal article
- research article
- Published by Wiley in Protein Science
- Vol. 1 (12), 1691-1698
- https://doi.org/10.1002/pro.5560011217
Abstract
The availability of fast and robust algorithms for protein structure comparison provides an opportunity to produce a database of three‐dimensional comparisons, called families of structurally similar proteins (FSSP). The database currently contains an extended structural family for each of 154 representative (below 30% sequence identity) protein chains. Each data set contains: the search structure; all its relatives with 70–30% sequence identity, aligned structurally; and all other proteins from the representative set that contain substructures significantly similar to the search structure. Very close relatives (above 70% sequence identity) rarely have significant structural differences and are excluded. The alignments of remote relatives are the result of pairwise all‐against‐all structural comparisons in the set of 154 representative protein chains. The comparisons were carried out with each of three novel automatic algorithms that cover different aspects of protein structure similarity. The user of the database has the choice between strict rigid‐body comparisons and comparisons that take into account interdomain motion or geometrical distortions, and between comparisons that require strictly sequential ordering of segments and comparisons that allow altered topology of loop connections or chain reversals. The data sets report the structurally equivalent residues in the form of a multiple alignment and as a list of matching fragments to facilitate inspection by three‐dimensional graphics. If substructures are ignored, the result is a database of structure alignments of full‐length proteins, including those in the twilight zone of sequence similarity. The database makes explicitly visible architectural similarities in the known part of the universe of protein folds and may be useful for understanding protein folding and for extracting structural modules for protein design. The data sets are available via Internet.
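The sequence-identity bands in the abstract can be illustrated with a minimal Python sketch. Only the 70% and 30% cutoffs come from the abstract; the function names (`percent_identity`, `classify_relative`, `build_family`), the identity definition (matches over gap-free aligned columns), and the bucket labels are assumptions made for illustration and are not the authors' implementation. The sketch covers only the sequence-based partition; whether a chain below 30% identity actually enters a family depends on the structural comparison, which is not modelled here.

```python
# Cutoffs taken from the abstract; everything else is a hypothetical illustration.
CLOSE_CUTOFF = 70.0   # relatives above this identity are excluded as near-duplicates
REMOTE_CUTOFF = 30.0  # chains below this identity belong to the representative set


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the gap-free columns of two aligned sequences.

    Assumption: '-' marks a gap and both strings have the same length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def classify_relative(identity: float) -> str:
    """Assign a chain to one of the three identity bands described in the abstract."""
    if identity > CLOSE_CUTOFF:
        return "excluded"            # very close relative, left out of the data set
    if identity >= REMOTE_CUTOFF:
        return "aligned_relative"    # 70-30% band, aligned structurally
    return "remote_candidate"        # needs a significant structural match to be listed


def build_family(identities: dict[str, float]) -> dict[str, list[str]]:
    """Group chain identifiers by band, given their identity to the search structure."""
    family: dict[str, list[str]] = {
        "excluded": [],
        "aligned_relative": [],
        "remote_candidate": [],
    }
    for chain_id in sorted(identities):
        family[classify_relative(identities[chain_id])].append(chain_id)
    return family
```

For example, `build_family({"a": 85.0, "b": 45.0, "c": 12.0})` places the made-up chain `a` among the excluded close relatives, `b` among the structurally aligned relatives, and `c` among the remote candidates that would still need a significant structural match.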