A consensus prediction of the secondary structure for the 6‐phospho‐β‐D‐galactosidase superfamily

Abstract
Two separate unrefined models for the secondary structure of two subfamilies of the 6-phospho-β-D-galactosidase superfamily were independently constructed by examining patterns of variation and conservation within homologous protein sequences, assigning surface, interior, parsing, and active site residues to positions in the alignment, and identifying periodicities in these assignments. A consensus model for the secondary structure of the entire superfamily was then built. The prediction tests the limits of an unrefined prediction made using this approach in a large protein with substantial functional and sequence divergence within the family. The protein belongs to the α–β class, with the core β strands aligned parallel. The supersecondary structural elements that are readily identified in this model are a parallel β sheet built by strands C, D, and E, with helices 2 and 3 connecting strands C and D and strands D and E, respectively, and an analogous α–β unit (strand G and helix 7) toward the end of the sequence. The resemblance of the supersecondary model to the tertiary structure formed by 8-fold α–β barrel proteins is almost certainly not coincidental.
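The periodicity step can be illustrated with a minimal sketch. It is not the procedure used for the prediction, which is not specified in this abstract; it only shows one common way to score a run of surface/interior assignments for the roughly 3.6-residue repeat expected of an α helix (100° per residue) against the 2-residue repeat expected of a β strand (180° per residue). The function names, the `i`/`s` encoding, and the simple "larger score wins" rule are all assumptions made for illustration.

```python
import cmath
import math

# Assumed rotation per residue (degrees) for the two periodic patterns.
HELIX_ANGLE = 100.0   # about 3.6 residues per turn
STRAND_ANGLE = 180.0  # alternation with period 2


def periodicity_score(assignments: str, angle_deg: float) -> float:
    """Normalized Fourier-type score of an interior/surface pattern.

    `assignments` is a string of 'i' (interior) and 's' (surface), one
    character per alignment position (hypothetical encoding). The score is
    the magnitude of the sum of +1/-1 values rotated by `angle_deg` per
    position, divided by the segment length, so it lies between 0 and 1.
    """
    if not assignments:
        raise ValueError("empty segment")
    total = 0j
    for k, label in enumerate(assignments):
        if label not in "is":
            raise ValueError(f"unexpected label {label!r}")
        value = 1.0 if label == "i" else -1.0
        total += value * cmath.exp(1j * math.radians(angle_deg) * k)
    return abs(total) / len(assignments)


def classify_segment(assignments: str) -> str:
    """Return 'helix' or 'strand', whichever periodicity scores higher."""
    helix = periodicity_score(assignments, HELIX_ANGLE)
    strand = periodicity_score(assignments, STRAND_ANGLE)
    return "helix" if helix > strand else "strand"


if __name__ == "__main__":
    # Interior positions recurring at i, i+3, i+4, i+7, ... (helix-like).
    print(classify_segment("issiissiisiissiiss"))
    # Strict interior/surface alternation (strand-like).
    print(classify_segment("isisisisis"))
```

A real analysis would also need a threshold below which a segment is left unassigned, and would treat parsing and active site positions separately; both are omitted here.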