The Origins of Specificity in Polyketide Synthase Protein Interactions

Abstract
Polyketides, a diverse group of heteropolymers with antibiotic and antitumor properties, are assembled in bacteria by multiprotein chains of modular polyketide synthase (PKS) proteins. Specific protein–protein interactions determine the order of proteins within a multiprotein chain, and thereby the order in which chemically distinct monomers are added to the growing polyketide product. Here we investigate the evolutionary and molecular origins of protein interaction specificity. We focus on the short, conserved N- and C-terminal docking domains that mediate interactions between modular PKS proteins. Our computational analysis, which combines protein sequence data with experimental protein interaction data, reveals a hierarchical interaction specificity code. PKS docking domains are descended from a single ancestral interacting pair, but have split into three phylogenetic classes that are mutually noninteracting. Specificity within one such compatibility class is determined by a few key residues, which can be used to define compatibility subclasses. We identify these residues using a novel, highly sensitive co-evolution detection algorithm called CRoSS (correlated residues of statistical significance). The residue pairs selected by CRoSS are involved in direct physical interactions in a docked-domain NMR structure. A single PKS system can use docking domain pairs from multiple classes, as well as domain pairs from multiple subclasses of any given class. The termini of individual proteins are frequently shuffled, but docking domain pairs straddling two interacting proteins are linked as an evolutionary module. The hierarchical and modular organization of the specificity code is intimately related to the processes by which bacteria generate new PKS pathways.

Author Summary
Biomolecular interactions can be extraordinarily specific. In many instances, a protein can select its single correct binding partner from among a large array of closely related candidates. For polyketide synthases (PKSs), a family of bacterial enzymes, such specificity is essential. Like workers on an assembly line, PKSs function as multiprotein chains, each enzyme modifying its substrate before passing it along to the next. And like a well-designed jigsaw puzzle, the overall multiprotein chain is correctly ordered precisely because each component protein can only bind to specific nearest neighbors. A PKS multiprotein chain is held together by sticky “head” and “tail” domains found at either end of each protein, the head of one protein binding to the tail of the next. We looked for patterns in the amino-acid sequences of these domains that could explain why certain head–tail pairs bind, while others do not. We discovered that heads and tails each come in three very different varieties. Mismatched head–tail pairs do not bind at all, while the binding of a matching head–tail pair is governed by the amino acids found at a few key positions on the physical interface between these domains.
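
To make the idea of co-evolution detection concrete, the following is a minimal sketch of how correlated residue positions across paired docking domains might be scored. It is not the CRoSS algorithm, whose details are not given in this section. It substitutes a generic mutual-information score with a shuffle-based significance estimate, and every name in it (`mutual_information`, `shuffle_p_value`, `top_pairs`) and all toy sequences are hypothetical illustrations, not data from the study.

```python
import math
import random
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equally long residue columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_a[a] * count_b[b]))
    return mi


def shuffle_p_value(col_a, col_b, trials=1000, seed=0):
    """Fraction of random re-pairings whose MI is at least the observed MI."""
    rng = random.Random(seed)
    observed = mutual_information(col_a, col_b)
    shuffled = list(col_b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # break the pairing between the two columns
        if mutual_information(col_a, shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (trials + 1)


def top_pairs(c_term, n_term, k=3):
    """Rank (C-terminal position, N-terminal position) pairs by MI."""
    scores = []
    for i in range(len(c_term[0])):
        col_i = [seq[i] for seq in c_term]
        for j in range(len(n_term[0])):
            col_j = [seq[j] for seq in n_term]
            scores.append((mutual_information(col_i, col_j), i, j))
    scores.sort(key=lambda t: (-t[0], t[1], t[2]))
    return scores[:k]


# Hypothetical toy alignment: row r of c_term docks with row r of n_term.
# Position 1 of each domain co-varies (K with D, R with E).
c_term = ["AKE", "AKD", "ARE", "ARD", "AKE", "ARD"]
n_term = ["LDG", "LDG", "LEG", "LEG", "LDG", "LEG"]

for score, i, j in top_pairs(c_term, n_term):
    print(f"C-term pos {i} / N-term pos {j}: MI = {score:.3f} bits")
```

In this toy alignment, the charge-swapping positions rank first because their residues change together across interacting pairs, while invariant positions score zero. A real analysis would additionally need to correct for phylogenetic relatedness and small sample sizes, which this sketch does not attempt.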