Folding free energy function selects native-like protein sequences in the core but not on the surface
- 4 October 2002
- journal article
- research article
- Published by Proceedings of the National Academy of Sciences in Proceedings of the National Academy of Sciences
- Vol. 99 (21) , 13554-13559
- https://doi.org/10.1073/pnas.212068599
Abstract
An automatic protein design procedure is used to select amino acid sequences that optimize the folding free energy function for a given protein. The only information used in designing the sequences is a set of known backbone structures for each protein, a rotamer library, and a well-established classical empirical force field, which relies on basic physical chemical principles that underlie molecular interactions and protein stability and has not been adjusted to yield native-like sequences. Applying the procedure to 7 different known protein folds, representing a total of 45 different native protein structures, yields ensembles of designed sequences that display remarkable similarity to their natural counterparts in the protein core but are distinctly non-native on the protein surface. We show that natural and designed sequences for a given fold score significantly higher than random sequences against profiles derived from both designed and natural sequence ensembles. Furthermore, we find that designed sequence profiles can be used to retrieve the native sequences for many of the analyzed proteins using standard PSI-BLAST searches in sequence databases. These findings may have important implications for our understanding of the selection pressures operating on natural protein sequences and hold promise for improving fold recognition.
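The profile comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a simple position-specific log-odds profile with a uniform background and a fixed pseudocount, and the toy ensemble, the function names (`build_profile`, `score`, `random_sequence`), and all parameter values are hypothetical choices made for illustration only.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(sequences, pseudocount=1.0):
    """Build per-position log-odds scores from equal-length sequences.

    Assumption: uniform background frequencies and a flat pseudocount,
    chosen for simplicity rather than taken from the paper.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    profile = []
    for pos in range(length):
        # Start every residue at the pseudocount so no frequency is zero.
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in sequences:
            counts[seq[pos]] += 1.0
        total = sum(counts.values())
        profile.append(
            {aa: math.log((counts[aa] / total) / background) for aa in AMINO_ACIDS}
        )
    return profile


def score(sequence, profile):
    """Sum the log-odds of each residue at its position in the profile."""
    if len(sequence) != len(profile):
        raise ValueError("sequence length must match profile length")
    return sum(column[aa] for aa, column in zip(sequence, profile))


def random_sequence(length, rng):
    """Draw a sequence with residues sampled uniformly at random."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


# Hypothetical toy ensemble standing in for designed or natural sequences.
ensemble = ["ACDEF", "ACDEY", "ACDQF", "ICDEF"]
profile = build_profile(ensemble)

member_score = score("ACDEF", profile)
rng = random.Random(0)
random_scores = [score(random_sequence(5, rng), profile) for _ in range(200)]
mean_random = sum(random_scores) / len(random_scores)
```

In this toy setting, a sequence resembling the ensemble scores above the average of uniformly random sequences, which is the kind of contrast the abstract reports. A real analysis would use aligned sequence ensembles for each fold and a statistical test on the score distributions.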