The Relationship between the Sequence Identities of Alpha Helical Proteins in the PDB and the Molecular Similarities of Their Ligands
- 1 November 2001
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Chemical Information and Computer Sciences
- Vol. 41 (6), 1617-1622
- https://doi.org/10.1021/ci010364q
Abstract
This paper considers the relationship between the percentage sequence identities of protein chains and the molecular similarities of the ligands they bind. Among a set of alpha helical proteins from the PDB, it is found that related proteins tend to bind similar ligands. Furthermore, the property of binding similar ligands can be used to define the categories of “like” and “unlike” pairs of protein chains, separated by an approximate cutoff at a sequence identity of, or somewhat above, 45%. Similarly, the property of binding related protein chains can be used to define “low” and “high” similarity pairs of ligand residues, with a cutoff at a Tanimoto score of 0.70. The ligands bound to two “like” protein chains are five times more likely to be of high similarity than would be expected if protein sequence identity and ligand molecular similarity were independent variables. Nonetheless, the nature of the PDB means that it is unclear whether the same conclusions would be reached with a data set representing an unbiased sample of all protein−ligand complexes in a living cell. The construction of an appropriate data set for such a study represents a significant challenge.
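The two quantities the abstract relies on, a Tanimoto score between ligands and the ratio of observed to expected "like"/"high" pairs under independence, can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes ligand fingerprints are given as sets of on-bit indices, that a pair counts as "like" at 45% identity or above and as "high" at a Tanimoto score of 0.70 or above (the abstract gives the cutoffs only approximately), and the function names `tanimoto` and `enrichment` are hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints (assumption)
    return len(a & b) / union


def enrichment(pairs, identity_cutoff=45.0, tanimoto_cutoff=0.70):
    """Observed / expected frequency of pairs that are both "like" and "high".

    pairs: iterable of (percent_sequence_identity, tanimoto_score) tuples.
    The expected frequency assumes the two variables are independent,
    i.e. P(like) * P(high). Using >= at both cutoffs is an assumption.
    """
    pairs = list(pairs)
    n = len(pairs)
    like = sum(1 for ident, _ in pairs if ident >= identity_cutoff)
    high = sum(1 for _, score in pairs if score >= tanimoto_cutoff)
    both = sum(
        1
        for ident, score in pairs
        if ident >= identity_cutoff and score >= tanimoto_cutoff
    )
    if n == 0 or like == 0 or high == 0:
        raise ValueError("enrichment is undefined without like and high pairs")
    return (both / n) / ((like / n) * (high / n))
```

A ratio of 1 would indicate independence; the abstract reports a value of about five for its data set. The ratio equals P(high | like) / P(high), which is the "times more likely" reading used in the abstract.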