Relationships between sequence and structure for the four-α-helix bundle tertiary motif in proteins

Abstract
The sequences of four-α-helix bundle proteins are characterized by a pattern of hydrophilic and hydrophobic amino acids that repeats every seven residues. At each position of the heptad repeat, the topology of the tertiary motif imposes specific constraints on the amino acid properties. These constraints give rise to patterns of amino acid distribution that are distinct from those of other proteins. The distributions at each of the heptad positions have been determined by a statistical analysis of structural and sequence data derived from seven families of aligned protein sequences. The composition of each position is dominated by a very small number of different amino acids, with the core positions consisting overwhelmingly of Leu and Ala. The positional preferences of the individual amino acids can generally be interpreted in terms of residue properties and topological constraints. The potential for four-α-helix bundle folding is reflected primarily in the pattern of residue occurrence in the heptad and not in the overall amino acid composition of the protein. Possible applications of this analysis in structure prediction, sequence alignment, and the rational design and engineering of four-α-helix bundle proteins are discussed.
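
The kind of per-position tally described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `heptad_distributions`, the a-to-g position labels (the usual heptad convention, not stated in the abstract), the assumption of a fixed register with no gaps, and the toy sequences are all hypothetical choices made for illustration.

```python
from collections import Counter

HEPTAD = "abcdefg"  # conventional labels for the seven heptad positions (assumed)

def heptad_distributions(sequences, start_position="a"):
    """Tally residue fractions at each heptad position.

    Assumes each sequence is a single helix whose first residue sits at
    `start_position`, and that the register never shifts (no gaps).
    """
    offset = HEPTAD.index(start_position)
    counts = {pos: Counter() for pos in HEPTAD}
    for seq in sequences:
        for i, residue in enumerate(seq):
            counts[HEPTAD[(i + offset) % 7]][residue] += 1
    # Convert raw counts into per-position fractions, skipping empty positions.
    return {
        pos: {aa: n / sum(c.values()) for aa, n in c.items()}
        for pos, c in counts.items()
        if c
    }

# Hypothetical toy helices, not data from the paper.
toy = ["LKELAEK", "AEKLLKE", "LEEALKK"]
dist = heptad_distributions(toy)
print(dist["a"])
print(dist["d"])
```

Comparing such per-position fractions with the overall composition of the same sequences is one simple way to see that the signal lies in where residues occur in the heptad and not in how often they occur overall.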
