Protein fold recognition using sequence‐derived predictions
Open Access
- 1 May 1996
- journal article
- research article
- Published by Wiley in Protein Science
- Vol. 5 (5), 947-955
- https://doi.org/10.1002/pro.5560050516
Abstract
In protein fold recognition, one assigns a probe amino acid sequence of unknown structure to one of a library of target 3D structures. Correct assignment depends on effective scoring of the probe sequence for its compatibility with each of the target structures. Here we show that, in addition to the amino acid sequence of the probe, sequence-derived properties of the probe sequence (such as the predicted secondary structure) are useful in fold assignment. The additional measure of compatibility between probe and target is the level of agreement between the predicted secondary structure of the probe and the known secondary structure of the target fold. That is, we recommend a sequence-structure compatibility function that combines previously developed compatibility functions (such as the 3D-1D scores of Bowie et al. [1991] or sequence-sequence replacement tables) with the predicted secondary structure of the probe sequence. The effect on fold assignment of adding predicted secondary structure is evaluated here by using a benchmark set of proteins (Fischer et al., 1996a). The 3D structures of the probe sequences of the benchmark are actually known, but are ignored by our method. The results show that the inclusion of the predicted secondary structure improves fold assignment by about 25%. The results also show that, if the true secondary structure of the probe were known, correct fold assignment would increase by an additional 8–32%. We conclude that incorporating sequence-derived predictions significantly improves assignment of sequences to known 3D folds. Finally, we apply the new method to assign folds to sequences in the SWISSPROT database; six fold assignments are given that are not detectable by standard sequence-sequence comparison methods; for two of these, the fold is known from X-ray crystallography and the fold assignment is correct.
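The idea of combining a sequence-based compatibility score with secondary-structure agreement can be illustrated with a minimal sketch. It is not the authors' scoring function: the additive form, the names `combined_score` and `global_alignment_score`, the weight `ss_weight`, the match and mismatch values, the linear gap penalty, and the toy `identity_score` table are all assumptions made for illustration. The sketch scores each aligned pair as a sequence term plus a weighted secondary-structure agreement term, then finds the best global alignment score by dynamic programming.

```python
from typing import Callable

def combined_score(probe_aa: str, target_aa: str,
                   probe_ss: str, target_ss: str,
                   seq_score: Callable[[str, str], float],
                   ss_weight: float = 1.0,
                   ss_match: float = 1.0,
                   ss_mismatch: float = -1.0) -> float:
    """Per-position score: sequence term plus weighted SS agreement.

    probe_ss is the PREDICTED state of the probe residue; target_ss is the
    KNOWN state of the target residue. All weights are hypothetical.
    """
    ss_term = ss_match if probe_ss == target_ss else ss_mismatch
    return seq_score(probe_aa, target_aa) + ss_weight * ss_term

def global_alignment_score(probe_seq: str, probe_ss: str,
                           target_seq: str, target_ss: str,
                           seq_score: Callable[[str, str], float],
                           gap: float = -2.0,
                           ss_weight: float = 1.0) -> float:
    """Best global alignment score with a linear gap penalty (assumed)."""
    n, m = len(probe_seq), len(target_seq)
    # prev[j]: best score aligning the probe prefix so far with target[:j]
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            pair = combined_score(probe_seq[i - 1], target_seq[j - 1],
                                  probe_ss[i - 1], target_ss[j - 1],
                                  seq_score, ss_weight)
            curr[j] = max(prev[j - 1] + pair,   # align residue i with j
                          prev[j] + gap,        # gap in the target
                          curr[j - 1] + gap)    # gap in the probe
        prev = curr
    return prev[m]

def identity_score(a: str, b: str) -> float:
    """Toy stand-in for a replacement table or 3D-1D score."""
    return 1.0 if a == b else 0.0

# Toy usage: same sequence, agreeing vs. disagreeing secondary structure.
agree = global_alignment_score("ACDE", "HHEE", "ACDE", "HHEE", identity_score)
disagree = global_alignment_score("ACDE", "HHEE", "ACDE", "EEHH", identity_score)
seq_only = global_alignment_score("ACDE", "HHEE", "ACDE", "HHEE",
                                  identity_score, ss_weight=0.0)
```

Setting `ss_weight=0.0` recovers a purely sequence-based score, so the contribution of the predicted secondary structure can be compared directly by ranking the same target library with and without the extra term.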
This publication has 32 references indexed in Scilit:
- Amino acid substitution matrices from an information theoretic perspective. Published by Elsevier, 2005
- Identification of common molecular subsequences. Published by Elsevier, 2004
- Factors Influencing the Ability of Knowledge-based Potentials to Identify Native Sequence-Structure Matches. Journal of Molecular Biology, 1994
- Prediction of Protein Structure by Evaluation of Sequence-structure Fitness: Aligning Sequences to Contact Profiles Derived from Three-dimensional Structures. Journal of Molecular Biology, 1993
- Prediction of Protein Secondary Structure at Better than 70% Accuracy. Journal of Molecular Biology, 1993
- Topology fingerprint approach to the inverse protein folding problem. Journal of Molecular Biology, 1992
- A new approach to protein fold recognition. Nature, 1992
- Basic local alignment search tool. Journal of Molecular Biology, 1990
- The protein data bank: A computer-based archival file for macromolecular structures. Journal of Molecular Biology, 1977
- A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 1970