“Sequence space soup” of proteins and copolymers
- 1 September 1991
- journal article
- research article
- Published by AIP Publishing in The Journal of Chemical Physics
- Vol. 95 (5), 3775-3787
- https://doi.org/10.1063/1.460828
Abstract
To study the protein folding problem, we use exhaustive computer enumeration to explore “sequence space soup,” an imaginary solution containing the “native” conformations (i.e., of lowest free energy) under folding conditions, of every possible copolymer sequence. The model is of short self-avoiding chains of hydrophobic (H) and polar (P) monomers configured on the two-dimensional square lattice. By exhaustive enumeration, we identify all native structures for every possible sequence. We find that random sequences of H/P copolymers will bear striking resemblance to known proteins: Most sequences under folding conditions will be approximately as compact as known proteins, will have considerable amounts of secondary structure, and it is most probable that an arbitrary sequence will fold to a number of lowest free energy conformations that is of order one. In these respects, this simple model shows that proteinlike behavior should arise simply in copolymers in which one monomer type is highly solvent averse. It suggests that the structures and uniquenesses of native proteins are not consequences of having 20 different monomer types, or of unique properties of amino acid monomers with regard to special packing or interactions, and thus that simple copolymers might be designable to collapse to proteinlike structures and properties. A good strategy for designing a sequence to have a minimum possible number of native states is to strategically insert many P monomers. Thus known proteins may be marginally stable due to a balance: More H residues stabilize the desired native state, but more P residues prevent simultaneous stabilization of undesired native states.
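The enumeration described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that the free energy under folding conditions is simply minus the number of non-bonded H-H nearest-neighbor contacts, which the abstract does not state explicitly, and that conformations differing only by rotation or reflection count as one. The function names (`conformations`, `hh_contacts`, `native_states`, `degeneracy_histogram`) are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import product

MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def conformations(n):
    """Yield all self-avoiding walks of n monomers on the square lattice.

    The first bond is fixed along +x (removes rotations) and the first
    turn must go toward +y (removes mirror images).
    """
    if n < 2:
        raise ValueError("need at least two monomers")

    def extend(path, occupied, turned):
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            if not turned and dy < 0:
                continue  # mirror image of a walk that turns upward first
            nxt = (x + dx, y + dy)
            if nxt in occupied:
                continue  # self-avoidance: a site holds one monomer
            path.append(nxt)
            occupied.add(nxt)
            yield from extend(path, occupied, turned or dy != 0)
            path.pop()
            occupied.discard(nxt)

    start = [(0, 0), (1, 0)]
    yield from extend(start, set(start), False)


def hh_contacts(seq, conf):
    """Count H-H lattice neighbors that are not bonded along the chain."""
    site = {pos: i for i, pos in enumerate(conf)}
    count = 0
    for i, (x, y) in enumerate(conf):
        if seq[i] != "H":
            continue
        # Look right and up only, so each neighboring pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = site.get((x + dx, y + dy))
            if j is not None and seq[j] == "H" and abs(i - j) > 1:
                count += 1
    return count


def native_states(seq, confs=None):
    """Return (max H-H contacts, list of conformations achieving it).

    Assumption: lowest free energy == most H-H contacts.
    """
    if confs is None:
        confs = list(conformations(len(seq)))
    best, natives = -1, []
    for conf in confs:
        k = hh_contacts(seq, conf)
        if k > best:
            best, natives = k, [conf]
        elif k == best:
            natives.append(conf)
    return best, natives


def degeneracy_histogram(n):
    """Map 'number of native states' -> 'number of sequences' for length n."""
    confs = list(conformations(n))
    hist = Counter()
    for letters in product("HP", repeat=n):
        _, natives = native_states("".join(letters), confs)
        hist[len(natives)] += 1
    return dict(hist)


if __name__ == "__main__":
    print(degeneracy_histogram(8))
```

Calling `native_states("HPPH")` returns the single U-shaped conformation in which the two end monomers touch. Both the number of conformations and the number of sequences grow exponentially with chain length, so a brute-force sketch like this is only practical for short chains, consistent with the abstract's restriction to short chains.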