Protein topology and stability define the space of allowed sequences
Open Access
- 22 January 2002
- journal article
- research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 99 (3), 1280-1285
- https://doi.org/10.1073/pnas.032405199
Abstract
We describe a new approach to explore and quantify the sequence space associated with a given protein structure. A set of sequences is optimized for a given target structure, using all-atom models and a physical energy function. Specificity of the sequence for its target is ensured by using the random energy model, which keeps the amino acid composition of the sequence constant. The designed sequences provide a multiple sequence alignment that describes the sequence space compatible with the structure of interest; here the size of this space is estimated by using an information entropy measure. In parallel, multiple alignments of naturally occurring sequences can be derived by using either sequence or structure alignments. We compared these 3 independent multiple sequence alignments for 10 different proteins, ranging in size from 56 to 310 residues. We observed that the subset of the sequence space derived by using our design procedure is similar in size to the sequence spaces observed in nature. These results suggest that the volume of sequence space compatible with a given protein fold is defined by the length of the protein as well as by the topology (i.e., geometry of the polypeptide chain) and the stability (i.e., free energy of denaturation) of the fold.
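The abstract does not give the exact form of the information entropy measure, so the following is only a minimal sketch of one common way to estimate the size of a sequence space from a multiple sequence alignment. It assumes that alignment columns are independent, that gap characters are ignored, and that entropy is measured in nats. The function names (`column_entropy`, `alignment_entropy`) and the toy alignment are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def column_entropy(column, ignore="-"):
    """Shannon entropy (in nats) of one alignment column, skipping gap characters."""
    residues = [c for c in column if c not in ignore]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def alignment_entropy(sequences):
    """Sum of per-column entropies for a list of equal-length aligned sequences.

    Assumption: columns are treated as independent, which ignores
    correlations between positions.
    """
    lengths = {len(s) for s in sequences}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return sum(column_entropy(col) for col in zip(*sequences))


# Toy alignment (hypothetical data): two invariant columns, two columns
# that each split evenly between two residues.
toy_alignment = ["ACDE", "ACDF", "AGDE", "AGDF"]
total_entropy = alignment_entropy(toy_alignment)

# Under the independence assumption, exp(total entropy) acts as an
# effective number of sequences compatible with the alignment.
effective_size = math.exp(total_entropy)
```

For the toy alignment, each variable column contributes ln 2 and each invariant column contributes 0, so the effective size comes out to 4, matching the four distinct sequences. Applying the same function to designed and natural alignments of the same protein would give directly comparable numbers, which is the kind of comparison the abstract describes.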