Local feature frequency profile: A method to measure structural similarity in proteins
Open Access
- 25 February 2004
- journal article
- research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 101 (11) , 3797-3802
- https://doi.org/10.1073/pnas.0308656100
Abstract
Measures of structural similarity between known protein structures provide an objective basis for classifying protein folds and for revealing a global view of the protein structure universe. Here, we describe a rapid method to measure structural similarity based on the profiles of representative local features of the Cα distance matrices of the compared protein structures. We first extract a finite number of representative local feature (LF) patterns from the distance matrices of all protein fold families by medoid analysis. Then, each Cα distance matrix of a protein structure is encoded by labeling each of its submatrices with the index of the nearest representative LF pattern. Finally, the structure is represented by the frequency distribution of these indices, which we call the LF frequency (LFF) profile of the protein. The LFF profile allows one to calculate structural similarity scores among a large number of protein structures quickly, and also to construct and update the “map” of the protein structure universe easily. The LFF profile method efficiently maps complex protein structures into a common Euclidean space without prior assignment of secondary structure information or structural alignment.
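The encoding steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the submatrix size, how submatrices are sampled, the distance used to find the nearest representative, or how profiles are normalized, so the choices below (every overlapping `size` x `size` block, Euclidean distance between flattened blocks, counts normalized to fractions) are assumptions. The representative patterns are assumed to have been chosen already (the paper uses medoid analysis for that step), and all function names are hypothetical.

```python
import math
from collections import Counter


def distance_matrix(coords):
    """Pairwise Euclidean distances between C-alpha coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def submatrices(dmat, size):
    """All overlapping size x size blocks, each flattened to a tuple.

    Assumption: every block position is used; the paper may sample differently.
    """
    n = len(dmat)
    blocks = []
    for i in range(n - size + 1):
        for j in range(n - size + 1):
            blocks.append(
                tuple(dmat[i + r][j + c] for r in range(size) for c in range(size))
            )
    return blocks


def nearest_index(block, representatives):
    """Index of the representative LF pattern closest to this block."""
    return min(
        range(len(representatives)),
        key=lambda k: math.dist(block, representatives[k]),
    )


def lff_profile(coords, representatives, size):
    """Frequency distribution of nearest-representative indices (LFF profile)."""
    blocks = submatrices(distance_matrix(coords), size)
    counts = Counter(nearest_index(b, representatives) for b in blocks)
    total = len(blocks)
    # Assumption: normalize counts so that profiles of proteins with
    # different lengths are comparable.
    return [counts[k] / total for k in range(len(representatives))]


def profile_distance(p, q):
    """Euclidean distance between two LFF profiles (smaller = more similar)."""
    return math.dist(p, q)


# Toy example with made-up coordinates and hand-picked representatives.
straight = [(3.8 * i, 0.0, 0.0) for i in range(8)]
zigzag = [(3.0 * i, 2.0 * (i % 2), 0.0) for i in range(8)]
reps = [(0.0, 3.8, 3.8, 0.0), (10.0, 10.0, 10.0, 10.0)]  # hypothetical LF patterns

profile_a = lff_profile(straight, reps, size=2)
profile_b = lff_profile(zigzag, reps, size=2)
print(profile_distance(profile_a, profile_b))
```

Because each structure is reduced to a fixed-length vector, comparing two structures costs one vector distance regardless of chain length, and no alignment or secondary structure assignment is involved, which is consistent with the speed claim in the abstract.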