High solubility of random-sequence proteins consisting of five kinds of primitive amino acids
Open Access
- 31 May 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Protein Engineering, Design and Selection
- Vol. 18 (6), 279-284
- https://doi.org/10.1093/protein/gzi034
Abstract
Searching for functional proteins among random-sequence libraries is a major challenge of protein engineering; the difficulties include the poor solubility of many random-sequence proteins. A library in which most of the polypeptides are soluble and stable would therefore be of great benefit. Although modern proteins consist of 20 amino acids, it has been suggested that early proteins evolved from a reduced alphabet. Here, we have constructed a library of random-sequence proteins consisting of only five amino acids, Ala, Gly, Val, Asp and Glu, which are believed to have been the most abundant in the prebiotic environment. Expression and characterization of arbitrarily chosen proteins in the library indicated that five-alphabet random-sequence proteins have higher solubility than do 20-alphabet random-sequence proteins with a similar level of hydrophobicity. The results support the reduced-alphabet hypothesis of the primordial genetic code and should also be helpful in constructing optimized protein libraries for evolutionary protein engineering.
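As an illustration of the comparison described in the abstract, the following minimal sketch draws a random sequence from the five-letter alphabet (Ala, Gly, Val, Asp, Glu) and computes its mean hydropathy. It rests on several assumptions that are not taken from the paper: the Kyte-Doolittle scale is used as one common hydrophobicity measure (the abstract does not say which measure the authors used), the five residues are drawn with equal probability, and the function names and the sequence length are hypothetical.

```python
import random

# Kyte-Doolittle hydropathy values for the five residues.
# Assumption: the abstract does not state which hydrophobicity scale was used.
KD = {"A": 1.8, "G": -0.4, "V": 4.2, "D": -3.5, "E": -3.5}
FIVE_LETTER_ALPHABET = "AGVDE"


def random_protein(length, alphabet=FIVE_LETTER_ALPHABET, rng=None):
    """Return a random sequence with each residue drawn uniformly (an assumption)."""
    rng = rng or random.Random()
    return "".join(rng.choice(alphabet) for _ in range(length))


def mean_hydropathy(seq):
    """Average the per-residue hydropathy values over the sequence."""
    return sum(KD[aa] for aa in seq) / len(seq)


if __name__ == "__main__":
    seq = random_protein(100, rng=random.Random(0))  # length 100 is illustrative
    print(seq)
    print(round(mean_hydropathy(seq), 3))
```

Under these assumptions, the expected mean hydropathy of a uniformly sampled five-letter sequence is the plain average of the five values, which could be used to pick 20-letter sequences of comparable hydrophobicity for a side-by-side comparison.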