Functionally important segments in proteins dissected using Gene Ontology and geometric clustering of peptide fragments

Abstract
We have developed a geometric clustering algorithm that uses backbone φ,ψ angles to group conformationally similar peptide fragments of any length. By labeling each fragment in a cluster with the level-specific Gene Ontology 'molecular function' term of its parent protein, we can compute a molecular function propensity and a p-value for each fragment in the cluster. Applying this clustering and statistical analysis to peptide fragments 8 residues in length and containing only trans peptide bonds shows that a molecular function propensity ≥20 combined with a p-value ≤0.05 can identify the fragments within a protein that are linked to its molecular function.
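The abstract does not give the formulas behind the propensity and the p-value, so the following Python sketch is illustrative only. It assumes that the propensity is the frequency of a Gene Ontology term within a cluster divided by its frequency in the whole fragment set, and that the p-value comes from a one-sided hypergeometric enrichment test; both are common choices and may differ from the definitions used in the paper. The function names and the variables `k`, `n`, `K` and `N` are hypothetical. Only the thresholds (propensity ≥20, p-value ≤0.05) are taken from the text.

```python
from math import comb


def propensity(k, n, K, N):
    """Assumed definition: (k / n) / (K / N).

    k: fragments in the cluster labeled with the GO term
    n: fragments in the cluster
    K: fragments in the whole data set labeled with the GO term
    N: fragments in the whole data set
    """
    # Algebraically equal to (k / n) / (K / N), written with one division.
    return (k * N) / (n * K)


def hypergeom_pvalue(k, n, K, N):
    """Assumed test: probability of seeing at least k labeled fragments
    when n fragments are drawn without replacement from N, of which K
    carry the label (one-sided hypergeometric tail)."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def is_function_linked(k, n, K, N, min_propensity=20.0, max_pvalue=0.05):
    """Apply the thresholds quoted in the abstract to one cluster/term pair."""
    return (propensity(k, n, K, N) >= min_propensity
            and hypergeom_pvalue(k, n, K, N) <= max_pvalue)


if __name__ == "__main__":
    # Made-up counts, for illustration only (not data from the paper).
    k, n, K, N = 5, 20, 50, 10000
    print(propensity(k, n, K, N))
    print(hypergeom_pvalue(k, n, K, N))
    print(is_function_linked(k, n, K, N))
```

Under these assumptions, a cluster in which a term is heavily over-represented relative to the background passes both thresholds, while a cluster whose term frequency matches the background has a propensity near 1 and is rejected.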