“Sequence space soup” of proteins and copolymers

1 September 1991

journal article
research article
Published by AIP Publishing in The Journal of Chemical Physics

Vol. 95 (5), 3775-3787
https://doi.org/10.1063/1.460828

Abstract

To study the protein folding problem, we use exhaustive computer enumeration to explore “sequence space soup,” an imaginary solution containing the “native” conformations (i.e., of lowest free energy) under folding conditions of every possible copolymer sequence. The model is of short self-avoiding chains of hydrophobic (H) and polar (P) monomers configured on the two-dimensional square lattice. By exhaustive enumeration, we identify all native structures for every possible sequence. We find that random sequences of H/P copolymers will bear striking resemblance to known proteins: Most sequences under folding conditions will be approximately as compact as known proteins, will have considerable amounts of secondary structure, and will most probably fold to a number of lowest free energy conformations that is of order one. In these respects, this simple model shows that proteinlike behavior should arise simply in copolymers in which one monomer type is highly solvent averse. It suggests that the structures and uniquenesses of native proteins are not consequences of having 20 different monomer types, or of unique properties of amino acid monomers with regard to special packing or interactions, and thus that simple copolymers might be designable to collapse to proteinlike structures and properties. A good strategy for designing a sequence to have a minimum possible number of native states is to strategically insert many P monomers. Thus known proteins may be marginally stable due to a balance: More H residues stabilize the desired native state, but more P residues prevent simultaneous stabilization of undesired native states.
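The exhaustive enumeration described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the simplest form of the H/P lattice model, in which each pair of H monomers that are lattice neighbors but not chain neighbors contributes one favorable contact, so a "native" conformation is one with the maximum number of HH contacts. All function names (conformations, contact_pairs, native_states, degeneracy_histogram) are hypothetical, and counting conformations only up to rotation and reflection is an assumption made here to keep the search small.

```python
from itertools import product

STEPS = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def conformations(n):
    """Yield self-avoiding walks of n monomers on the square lattice,
    one representative per rotation/reflection class (assumed convention)."""
    def grow(path, occupied, turned):
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in STEPS:
            if not turned and dy < 0:
                continue  # first off-axis step must go up: removes mirror images
            nxt = (x + dx, y + dy)
            if nxt in occupied:
                continue  # self-avoidance
            path.append(nxt)
            occupied.add(nxt)
            yield from grow(path, occupied, turned or dy != 0)
            path.pop()
            occupied.remove(nxt)

    if n == 1:
        yield ((0, 0),)
        return
    start = [(0, 0), (1, 0)]  # fixing the first bond removes rotations
    yield from grow(start, set(start), False)


def contact_pairs(conf):
    """Return (i, j) pairs of monomers that are lattice neighbors
    but not adjacent along the chain."""
    index = {pos: i for i, pos in enumerate(conf)}
    pairs = []
    for i, (x, y) in enumerate(conf):
        for dx, dy in ((1, 0), (0, 1)):  # look right and up: each pair once
            j = index.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1:
                pairs.append((min(i, j), max(i, j)))
    return pairs


def native_states(seq, contact_maps):
    """Return (max HH contacts, number of conformations achieving it)."""
    best, count = 0, 0
    for pairs in contact_maps:
        hh = sum(1 for i, j in pairs if seq[i] == "H" and seq[j] == "H")
        if hh > best:
            best, count = hh, 1
        elif hh == best:
            count += 1
    return best, count


def degeneracy_histogram(n):
    """Map each native-state count g to the number of sequences having it."""
    maps = [contact_pairs(c) for c in conformations(n)]
    hist = {}
    for seq in product("HP", repeat=n):
        _, g = native_states(seq, maps)
        hist[g] = hist.get(g, 0) + 1
    return hist


if __name__ == "__main__":
    print(degeneracy_histogram(4))
```

Under these assumptions, a chain of four monomers has five distinct conformations, and only the U-shaped one brings the two ends into contact, so a sequence such as HPPH has a single native state while PPPP has five. The cost grows exponentially in both the number of sequences and the number of conformations, which is why this approach is limited to short chains.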
