Approximation of Protein Structure for Fast Similarity Measures
- 1 March 2004
- journal article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 11 (2-3), 299-317
- https://doi.org/10.1089/1066527041410355
Abstract
The structural comparison of two proteins comes up in many applications in structural biology where it is often necessary to find similarities in very large conformation sets. This work describes techniques to achieve significant speedup in the computation of structural similarity between two given conformations, at the expense of introducing a small error in the similarity measure. Furthermore, the proposed computational scheme allows for a tradeoff between speedup and error. This scheme exploits the fact that the Cα representation of a protein conformation contains redundant information, due to the chain topology and limited compactness of proteins. This redundancy can be reduced by approximating subchains of a protein by their centers of mass, resulting in a smaller number of points to describe a conformation. A Haar wavelet analysis of random chains and proteins is used to justify this approximated representation. Similarity measures computed with this representation are highly correlated to the measures computed with the original Cα representation. Therefore, they can be used in applications where small similarity errors can be tolerated or as fast filters in applications that require exact measures. Computational tests have been conducted on two applications, nearest neighbor search and automatic structural classification.
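The subchain approximation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`average_subchains`, `drms`, `random_chain`), the block length `m`, the handling of a short trailing block, and the choice of distance RMS as the similarity measure are all assumptions made for illustration. The center of mass of each subchain is taken as the unweighted centroid of its Cα positions, which assumes every residue carries equal mass. The test data is a synthetic random walk with a 3.8 Å step, roughly the spacing of consecutive Cα atoms, not a real protein.

```python
import math
import random


def average_subchains(coords, m):
    """Replace each run of m consecutive points by its unweighted centroid.

    Assumption of this sketch: a trailing run shorter than m is averaged too.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    reduced = []
    for start in range(0, len(coords), m):
        block = coords[start:start + m]
        reduced.append(
            tuple(sum(p[k] for p in block) / len(block) for k in range(3))
        )
    return reduced


def drms(a, b):
    """Distance RMS: root-mean-square difference between the intra-chain
    distance matrices of two conformations with the same number of points."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two conformations of equal length (>= 2 points)")
    total, pairs = 0.0, 0
    for i in range(len(a)):
        for j in range(i + 1, len(a)):
            diff = math.dist(a[i], a[j]) - math.dist(b[i], b[j])
            total += diff * diff
            pairs += 1
    return math.sqrt(total / pairs)


def random_chain(n, step=3.8, seed=0):
    """Hypothetical test data: a random walk with fixed step length."""
    rng = random.Random(seed)
    pts = [(0.0, 0.0, 0.0)]
    while len(pts) < n:
        d = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in d))
        if norm == 0.0:
            continue  # degenerate direction, draw again
        last = pts[-1]
        pts.append(tuple(last[k] + step * d[k] / norm for k in range(3)))
    return pts


if __name__ == "__main__":
    a = random_chain(120, seed=1)
    b = random_chain(120, seed=2)
    m = 4
    a_small, b_small = average_subchains(a, m), average_subchains(b, m)
    print("points per conformation:", len(a), "->", len(a_small))
    print("dRMS on full chains:    ", drms(a, b))
    print("dRMS on averaged chains:", drms(a_small, b_small))
```

In this sketch the measure loops over all point pairs, so cutting the number of points by a factor of m cuts the number of pairs by roughly m², which is where a speedup of this kind would come from. The parameter `m` plays the role of the speedup versus error tradeoff mentioned in the abstract. When m is a power of two, the block averages equal the Haar approximation coefficients at the corresponding level, up to a normalization factor.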