Prediction of super-secondary structure in proteins

Abstract
Various methods for the prediction of secondary structure from amino acid sequence can consistently achieve an average accuracy of 60% when tested on several proteins [1–3]. Improvement on this value has proved difficult, despite increases in the size of the data set and refinements of predictive techniques [4]. The difficulty almost certainly derives from the influence of long-range interactions and the restrictions required to attain favourable protein topologies. We describe here a novel approach to structure prediction from amino acid sequence based on the recognition of super-secondary structure. The structure we initially consider is the βαβ unit, which consists of two parallel β-strands connected by an α-helix. From an analysis of all known βαβ units, an ideal secondary structure sequence was derived. This was used as a template to locate probable βαβ sequences in a standard secondary structure prediction. The method correctly predicted the location of 70% of the βαβ units in 16 β/α-type proteins. This led to a 7.5% average improvement over the original secondary structure prediction.
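The template idea can be illustrated with a minimal sketch. It is not the procedure used in this work: the three-state alphabet (E for β-strand, H for α-helix, C for coil), the segment lengths, the fraction-of-identical-positions score and the threshold are all assumptions chosen for illustration, and the function names are hypothetical.

```python
def make_template(strand=5, loop=2, helix=10):
    """Build an idealised beta-alpha-beta template string.

    The segment lengths are illustrative assumptions, not values from the paper.
    """
    return "E" * strand + "C" * loop + "H" * helix + "C" * loop + "E" * strand


def match_score(window, template):
    """Return the fraction of positions where the predicted state equals the template state."""
    hits = sum(1 for p, t in zip(window, template) if p == t)
    return hits / len(template)


def locate_units(predicted, template, threshold=0.8):
    """Slide the template along a per-residue prediction string.

    Returns a list of (start_index, score) for every window whose score
    is at or above the threshold (an assumed cut-off).
    """
    n, m = len(predicted), len(template)
    found = []
    for start in range(n - m + 1):
        score = match_score(predicted[start:start + m], template)
        if score >= threshold:
            found.append((start, score))
    return found


if __name__ == "__main__":
    template = make_template()
    # Hypothetical prediction: three coil residues on either side of a perfect unit.
    predicted = "CCC" + template + "CCC"
    print(locate_units(predicted, template, threshold=1.0))
```

In this toy input the only window that matches the template exactly starts at index 3. A realistic template would need to allow variable strand, helix and loop lengths, which a fixed-length string comparison does not capture.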