Knowledge-based potential defined for a rotamer library to design protein sequences
Open Access
- 1 August 2001
- journal article
- research article
- Published by Oxford University Press (OUP) in Protein Engineering, Design and Selection
- Vol. 14 (8), 557-564
- https://doi.org/10.1093/protein/14.8.557
Abstract
A knowledge-based potential for a rotamer library was developed to design protein sequences. Protein side-chain conformations are represented by 56 templates. The fitness of each template to a given structural site-environment is evaluated by a combined function of three knowledge-based terms: two-body side-chain packing, one-body hydration and local conformation. The number of matches between the native sequence and the structural site-environment in the database and that of the virtually settled mismatches, both counted in advance, were transformed into energy scores. In the best-14 test (an assessment of whether the native rotamer at its structural site is reproduced within the top quarter of the 56 fitness rank positions), in the structural stability analysis of mutants of human and T4 lysozymes, and in the inverse-folding search with a structure profile against the sequence database, this function performs better than both the function derived with the conventional normalization and our previously developed function. De novo sequence design targeting various structural motifs was conducted with the function. The sequences thus obtained exhibit reasonable molecular masses and hydrophobic/hydrophilic patterns similar to the native sequences of the targets, and they behave as if they were homologs of the target proteins in BLASTP searches. This significant improvement is discussed in terms of the reference state for normalization and the crucial role of short-range repulsion in prohibiting residue bumps.
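The best-14 criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each site has one score per template (56 in total), that a lower score means a better fit, and that ties are resolved in favor of the native template; the names `native_rank` and `best14_rate` are hypothetical.

```python
from typing import Sequence, Tuple

N_TEMPLATES = 56            # number of side-chain templates stated in the abstract
TOP_K = N_TEMPLATES // 4    # a quarter of the rank positions, i.e. 14


def native_rank(scores: Sequence[float], native_index: int) -> int:
    """Return the 1-based rank of the native template at one site.

    Assumption: lower score = better fit (energy-like). Ties are counted
    in favor of the native template (optimistic ranking).
    """
    native_score = scores[native_index]
    return 1 + sum(1 for s in scores if s < native_score)


def best14_rate(sites: Sequence[Tuple[Sequence[float], int]]) -> float:
    """Fraction of sites whose native template ranks within the top 14.

    Each site is a (scores, native_index) pair; the input layout is an
    assumption made for this sketch.
    """
    hits = sum(1 for scores, native in sites
               if native_rank(scores, native) <= TOP_K)
    return hits / len(sites)
```

For example, with scores `0.0, 1.0, ..., 55.0` at a site, a native template at index 13 has rank 14 and counts as reproduced, whereas one at index 14 has rank 15 and does not.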