Fast detection of common geometric substructure in proteins
- 1 April 1999
- conference paper
- Published by Association for Computing Machinery (ACM)
- Vol. 6, 104-113
- https://doi.org/10.1145/299432.299464
Abstract
We consider the problem of identifying common three-dimensional substructures between proteins. Our method is based on comparing the shape of the $\alpha$-carbon backbone structures of the proteins in order to find 3D rigid motions that bring portions of the geometric structures into correspondence. We propose a geometric representation of protein backbone chains that is compact yet allows for similarity measures that are robust against noise and outliers. We represent the structure of the backbone as a sequence of unit vectors, defined by each adjacent pair of $\alpha$-carbons; we then define a measure of the similarity of two protein structures based on the root-mean-square (RMS) distance between corresponding orientation vectors in the two proteins. Our measure has several advantages over the standard position-based RMS measures that are commonly used for comparing protein shapes. In particular, the measure behaves well for comparing substructures because, unlike with position-based measures, the nonmatching portions of the structure do not dominate the measure. At the same time, it avoids the quadratic space and computational difficulties associated with the use of distance matrices and contact maps. We show applications of our approach to detecting common contiguous substructures in pairs of proteins, as well as the more difficult problem of identifying common protein domains (i.e., larger substructures that are not necessarily contiguous along the protein chain).
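To make the representation in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the backbone is given as a list of $\alpha$-carbon coordinates, that the two chains have already been placed in a common orientation, and that residues are matched one to one; the search over rigid motions described in the abstract is omitted. The function names `unit_vectors` and `orientation_rms` are hypothetical and chosen only for illustration.

```python
import math


def unit_vectors(ca_coords):
    """Convert a list of (x, y, z) alpha-carbon positions into unit vectors,
    one per adjacent pair of alpha-carbons along the chain."""
    vectors = []
    for (x0, y0, z0), (x1, y1, z1) in zip(ca_coords, ca_coords[1:]):
        dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
        norm = math.sqrt(dx * dx + dy * dy + dz * dz)
        if norm == 0.0:
            raise ValueError("consecutive alpha-carbons coincide")
        vectors.append((dx / norm, dy / norm, dz / norm))
    return vectors


def orientation_rms(u, v):
    """RMS distance between two equal-length sequences of unit vectors.

    Assumption: u[i] corresponds to v[i], and no rotation is applied here.
    """
    if len(u) != len(v) or not u:
        raise ValueError("sequences must be non-empty and of equal length")
    total = sum((a - b) ** 2 for p, q in zip(u, v) for a, b in zip(p, q))
    return math.sqrt(total / len(u))


# Made-up coordinates (not from any real protein), spaced 3.8 units apart.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (3.8, 3.8, 3.8)]
# The same chain shifted in space: direction vectors ignore translation.
chain_b = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in chain_a]

print(orientation_rms(unit_vectors(chain_a), unit_vectors(chain_b)))
```

Because each unit vector has length one, a single pair of vectors contributes at most 4 to the sum of squared differences, so the value returned here never exceeds 2. This bound illustrates why nonmatching portions cannot dominate the measure the way large positional deviations can in a position-based RMS.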