Efficient detection of three-dimensional structural motifs in biological macromolecules by computer vision techniques.
- 1 December 1991
- journal article
- research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 88 (23), 10495-10499
- https://doi.org/10.1073/pnas.88.23.10495
Abstract
Macromolecules carrying biological information often consist of independent modules containing recurring structural motifs. Detection of a specific structural motif within a protein (or DNA) aids in elucidating the role played by the protein (DNA element) and the mechanism of its operation. The number of crystallographically known structures at high resolution is increasing very rapidly. Yet, comparison of three-dimensional structures is a laborious, time-consuming procedure that typically requires a manual phase. To date, there is no fast automated procedure for structural comparisons. We present an efficient algorithm with O(n^3) worst-case time complexity for achieving such a goal (where n is the number of atoms in the examined structure). The method is truly three-dimensional and sequence-order-independent, and thus insensitive to gaps, insertions, or deletions. This algorithm is based on the geometric hashing paradigm, which was originally developed for object recognition problems in computer vision. It introduces an indexing approach based on transformation-invariant representations and is especially geared toward efficient recognition of partial structures in rigid objects belonging to large databases. This algorithm is suitable for quick scanning of structural databases and will detect a recurring structural motif that is a priori unknown. The algorithm uses protein (or DNA) structures, atomic labels, and their three-dimensional coordinates. Additional information pertaining to the structure speeds the comparisons. The algorithm is straightforwardly parallelizable, and several versions of it for computer vision applications have been implemented on the massively parallel Connection Machine. A prototype version of the algorithm has been implemented and applied to the detection of substructures in proteins.
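The geometric hashing idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`frame`, `invariant_keys`, `build_table`, `recognize`), the grid cell size `cell`, and the vote threshold `min_votes` are all assumptions made for illustration, and atomic labels are left out. Each ordered triplet of non-collinear points defines a reference frame; the coordinates of the remaining points in that frame do not change under rotation or translation, so they can serve as hash keys. Recognition then amounts to voting for (model, basis) pairs. The brute-force loops here do not attempt to reach the O(n^3) bound quoted in the abstract.

```python
from collections import Counter, defaultdict
from itertools import permutations
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(a):
    n = math.sqrt(_dot(a, a))
    return None if n < 1e-9 else (a[0] / n, a[1] / n, a[2] / n)

def frame(p0, p1, p2):
    """Orthonormal frame (origin, x, y, z) from an ordered triplet,
    or None if the three points are (nearly) collinear."""
    x = _unit(_sub(p1, p0))
    if x is None:
        return None
    z = _unit(_cross(x, _sub(p2, p0)))
    if z is None:
        return None
    return p0, x, _cross(z, x), z

def invariant_keys(points, basis, cell=0.5):
    """Quantized coordinates of every non-basis point, expressed in the
    frame of the basis triplet (invariant to rotation and translation)."""
    f = frame(*(points[i] for i in basis))
    if f is None:
        return []
    origin, x, y, z = f
    keys = []
    for i, p in enumerate(points):
        if i in basis:
            continue
        d = _sub(p, origin)
        keys.append(tuple(round(_dot(d, axis) / cell) for axis in (x, y, z)))
    return keys

def build_table(models, cell=0.5):
    """Preprocessing: index every model under every ordered basis triplet."""
    table = defaultdict(list)
    for name, pts in models.items():
        for basis in permutations(range(len(pts)), 3):
            for key in invariant_keys(pts, basis, cell):
                table[key].append((name, basis))
    return table

def recognize(table, target, cell=0.5, min_votes=3):
    """Try target bases and vote; return (votes, model, model_basis,
    target_basis) for the best-supported match, or None."""
    best = None
    for basis in permutations(range(len(target)), 3):
        votes = Counter()
        for key in invariant_keys(target, basis, cell):
            # set(): one target point gives at most one vote per entry
            for entry in set(table.get(key, ())):
                votes[entry] += 1
        if votes:
            (name, model_basis), n = votes.most_common(1)[0]
            if n >= min_votes and (best is None or n > best[0]):
                best = (n, name, model_basis, basis)
    return best
```

Because the keys depend only on geometry, the order of the points plays no role, which is what makes the approach sequence-order-independent. A fuller version would likely add the atomic label to each key and tolerate coordinate noise by also checking neighboring grid cells; both are omitted here to keep the sketch short.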