A new bioinformatic approach to detect common 3D sites in protein structures
- 3 June 2003
- journal article
- research article
- Published by Wiley in Proteins-Structure Function and Bioinformatics
- Vol. 52 (2), 137-145
- https://doi.org/10.1002/prot.10339
Abstract
An innovative bioinformatic method has been designed and implemented to detect similar three‐dimensional (3D) sites in proteins. This approach allows the comparison of protein structures or substructures and detects local spatial similarities: this method is completely independent from the amino acid sequence and from the backbone structure. In contrast to already existing tools, the basis for this method is a representation of the protein structure by a set of stereochemical groups that are defined independently from the notion of amino acid. An efficient heuristic for finding similarities that uses graphs of triangles of chemical groups to represent the protein structures has been developed. The implementation of this heuristic constitutes a software named SuMo (Surfing the Molecules), which allows the dynamic definition of chemical groups, the selection of sites in the proteins, and the management and screening of databases. To show the relevance of this approach, we focused on two extreme examples illustrating convergent and divergent evolution. In two unrelated serine proteases, SuMo detects one common site, which corresponds to the catalytic triad. In the legume lectins family composed of >100 structures that share similar sequences and folds but may have lost their ability to bind a carbohydrate molecule, SuMo discriminates between functional and non‐functional lectins with a selectivity of 96%. The time needed for searching a given site in a protein structure is typically 0.1 s on a PIII 800MHz/Linux computer; thus, in further studies, SuMo will be used to screen the PDB. Proteins 2003;52:137–145.
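The idea of comparing structures through triangles of chemical groups, independently of sequence and backbone, can be illustrated with a small sketch. This is not the SuMo implementation: the group labels, the `max_edge` and `tol` thresholds, and the function names are all assumptions made for illustration, and the sketch only matches individual triangles rather than building the graphs of triangles that the abstract describes.

```python
from itertools import combinations
from math import dist

def triangles(groups, max_edge=8.0):
    """List triangles of chemical groups whose three edges are all <= max_edge.

    `groups` is a list of (group_type, (x, y, z)) pairs (hypothetical format).
    Each triangle is returned as (index_triple, signature), where the signature
    is the sorted list of (sorted type pair, edge length) for its three edges.
    """
    result = []
    for (i, a), (j, b), (k, c) in combinations(enumerate(groups), 3):
        edges = []
        for (ta, pa), (tb, pb) in ((a, b), (a, c), (b, c)):
            d = dist(pa, pb)
            if d > max_edge:
                break  # triangle too large, discard it
            edges.append((tuple(sorted((ta, tb))), d))
        else:
            result.append(((i, j, k), sorted(edges)))
    return result

def similar(sig1, sig2, tol=1.0):
    """Two signatures match if edge types agree and lengths differ by <= tol."""
    return all(t1 == t2 and abs(d1 - d2) <= tol
               for (t1, d1), (t2, d2) in zip(sig1, sig2))

def common_triangles(groups_a, groups_b, max_edge=8.0, tol=1.0):
    """Return pairs of index triples for matching triangles in two structures."""
    tris_b = triangles(groups_b, max_edge)
    return [(ia, ib)
            for ia, sa in triangles(groups_a, max_edge)
            for ib, sb in tris_b
            if similar(sa, sb, tol)]

# Toy data (invented coordinates): the same three groups in two different
# positions and orders, plus one distant unrelated group in site_b.
site_a = [("hydroxyl", (0, 0, 0)),
          ("imidazole", (3, 0, 0)),
          ("carboxylate", (3, 4, 0))]
site_b = [("imidazole", (10, 10, 13)),
          ("hydroxyl", (10, 10, 10)),
          ("carboxylate", (14, 10, 13)),
          ("amide", (30, 30, 30))]

print(common_triangles(site_a, site_b))
```

Because the signature uses only group types and inter-group distances, the match does not depend on residue order or on how the structure is oriented in space. It also cannot distinguish a triangle from its mirror image, which a fuller method would need to handle.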
Funding Information
- French Ministère de la Recherche