CORA—Topological fingerprints for protein structural families
- 1 January 1999
- journal article
- Published by Wiley in Protein Science
- Vol. 8 (4), 699-715
- https://doi.org/10.1110/ps.8.4.699
Abstract
CORA is a suite of programs for multiply aligning and analyzing protein structural families to identify the consensus positions and capture their most conserved structural characteristics (e.g., residue accessibility, torsional angles, and global geometry as described by inter-residue vectors/contacts). Knowledge of these structurally conserved positions, which are mostly in the core of the fold, and of their properties significantly improves the identification and classification of newly determined relatives. Information is encoded in a consensus three-dimensional (3D) template, and relatives are found by a sensitive alignment method that employs a new scoring scheme based on conserved residue contacts. By encapsulating these critical "core" features, templates perform more reliably in recognizing distant structural relatives than searches with representative structures. Parameters for 3D-template generation and alignment were optimized for each structural class (mainly-alpha, mainly-beta, alpha-beta), using representative superfold families. For all families selected, the templates gave significant improvements in sensitivity and selectivity in recognizing distant structural relatives. Furthermore, since templates contain less than 70% of fold positions and compare fewer positions when aligning structures, scans are at least an order of magnitude faster than scans using selected structures. CORA was subsequently tested on eight other broad structural families from the CATH database. Diagnostic plots are generated automatically and provide qualitative assistance for classifying newly determined relatives. They are demonstrated here by application to the large globin-like fold family. CORA templates for both homologous superfamilies and fold families will be stored in CATH and used to improve the classification and analysis of newly determined structures.
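The idea of consensus positions and conserved residue contacts can be illustrated with a minimal sketch. This is not CORA's actual scoring scheme, which the abstract does not specify. It assumes that a multiple structural alignment is already available as rows of residue indices (with None marking a gap), that a contact means a C-alpha to C-alpha distance at or below a cutoff, and that "conserved" means present in a chosen fraction of structures. The function names, the cutoff, and the fractions are all hypothetical.

```python
from itertools import combinations
from math import dist


def consensus_columns(alignment, min_fraction=1.0):
    """Return alignment columns occupied in at least min_fraction of structures.

    alignment: one row per structure; each entry is a residue index or None (gap).
    """
    n_structs = len(alignment)
    keep = []
    for col in range(len(alignment[0])):
        occupied = sum(1 for row in alignment if row[col] is not None)
        if occupied / n_structs >= min_fraction:
            keep.append(col)
    return keep


def conserved_contacts(alignment, coords, columns, cutoff=8.0, min_fraction=1.0):
    """Return column pairs that are in contact in at least min_fraction of structures.

    coords: one list of (x, y, z) C-alpha coordinates per structure (assumed input).
    A pair counts as a contact when the distance is <= cutoff (an assumed criterion).
    """
    n_structs = len(alignment)
    contacts = set()
    for a, b in combinations(columns, 2):
        hits = 0
        for row, xyz in zip(alignment, coords):
            i, j = row[a], row[b]
            if i is not None and j is not None and dist(xyz[i], xyz[j]) <= cutoff:
                hits += 1
        if hits / n_structs >= min_fraction:
            contacts.add((a, b))
    return contacts


# Toy input: two tiny "structures" aligned over four columns.
alignment = [
    [0, 1, 2, 3],      # structure A occupies every column
    [0, 1, 2, None],   # structure B has a gap in the last column
]
coords = [
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.0, 1.0, 0.0)],
]

core = consensus_columns(alignment)
template = conserved_contacts(alignment, coords, core, cutoff=6.0)
```

In this toy case the gapped fourth column is excluded from the consensus set, so the contact search runs over fewer positions than the full structure. That is the same reason the abstract gives for template scans being faster than scans with whole structures.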