Recursive domains in proteins

Abstract
The domain is a fundamental unit of protein structure. Numerous studies have analyzed folding patterns in protein domains of known structure to gain insight into the underlying protein folding process. Are such patterns a haphazard assortment, or are they similar to sentences in a language, which can be generated by an underlying grammar? Specifically, can a small number of intuitively sensible rules generate a large class of folds, including feasible new folds? In this paper, we use tools from graph theory to explore the extent to which four simple rules can generate the known all-β folds. As a control, an exhaustive set of β-sandwiches was tested and found to be largely incompatible with such a grammar. The existence of a protein grammar has potential implications for both the mechanism of folding and the evolution of domains.