De novo and inverse folding predictions of protein structure and dynamics

Abstract
In the last two years, the use of simplified models has facilitated major progress in the globular protein folding problem, viz., the prediction of the three-dimensional (3D) structure of a globular protein from its amino acid sequence. A number of groups have addressed the inverse folding problem where one examines the compatibility of a given sequence with a given (and already determined) structure. A comparison of extant inverse protein-folding algorithms is presented, and methodologies for identifying sequences likely to adopt identical folding topologies, even when they lack sequence homology, are described. Extension to produce structural templates or fingerprints from idealized structures is discussed, and for eight-membered β-barrel proteins, it is shown that idealized fingerprints constructed from simple topology diagrams can correctly identify sequences having the appropriate topology. Furthermore, this inverse folding algorithm is generalized to predict elements of supersecondary structure including β-hairpins, helical hairpins and α/β/α fragments. Then, we describe a very high coordination number lattice model that can predict the 3D structure of a number of globular proteins de novo; i.e. using just the amino acid sequence. Applications to sequences designed by DeGrado and co-workers [Biophys. J., 61 (1992) A265] predict folding intermediates, native states and relative stabilities in accord with experiment. The methodology has also been applied to the four-helix bundle designed by Richardson and co-workers [Science, 249 (1990) 884] and a redesigned monomeric version of a naturally occurring four-helix dimer, rop. Based on comparison to the rop dimer, the simulations predict conformations with rms values of 3–4 Å from native. Furthermore, the de novo algorithms can asses the stability of the folds predicted from the inverse algorithm, while the inverse folding algorithms can assess the quality of the de novo models. Thus, the synergism of the de novo and inverse folding algorthhm approaches provides a set of complementary tools that will facilitate further progress on the protein-folding problem.