Performance of Similarity Measures in 2D Fragment-Based Similarity Searching: Comparison of Structural Descriptors and Similarity Coefficients
- 15 October 2002
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Chemical Information and Computer Sciences
- Vol. 42 (6) , 1407-1414
- https://doi.org/10.1021/ci025531g
Abstract
2D fragment-based similarity searching is one of the most popular techniques for searching large databases of chemical structures and has been widely applied in drug discovery. However, its performance, especially its effectiveness in retrieving active structural analogues, has not been adequately studied. We report a series of computational experiments in which we systematically studied the influence of structural descriptors and similarity coefficients on the effectiveness of similarity searching. The study was conducted using two large public data sets, NCI anti-AIDS and MDDR. Four sets of 2D linear fragment descriptors, based on the original definitions of atom pairs and atom sequences, were compared. The effect of using the Tanimoto coefficient and the Euclidean distance was studied as a function of descriptor set. The results clearly indicate that the Tanimoto coefficient is superior to the Euclidean distance in 2D fragment-based similarity searching in terms of hit rate, while atom sequences demonstrate the best overall performance among the structural descriptors we studied.
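
The two similarity measures compared in the abstract can be illustrated with a minimal sketch. It assumes binary fingerprints stored as Python sets of fragment keys; the fragment keys, molecule names, and the `tanimoto` and `euclidean` helpers are hypothetical and are not taken from the paper, whose descriptors are atom pairs and atom sequences. The toy data is chosen only to show that the two measures can rank the same database differently.

```python
import math


def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient for binary fingerprints: c / (a + b - c)."""
    if not a and not b:
        return 0.0  # convention assumed here for two empty fingerprints
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def euclidean(a: set, b: set) -> float:
    """Euclidean distance for binary fingerprints: sqrt(number of differing bits)."""
    return math.sqrt(len(a ^ b))


# Hypothetical fragment keys; real descriptors would be atom pairs or atom sequences.
query = {"C-C", "C-N", "C-O", "N-O"}
database = {
    "mol_A": {"C-C", "C-N", "C-O"},
    "mol_B": {"C-C"},
    "mol_C": {"C-C", "C-N", "C-O", "N-O", "C-S", "S-O", "C-Cl", "C-F"},
}

# Tanimoto is a similarity (higher is better); Euclidean is a distance (lower is better).
by_tanimoto = sorted(database, key=lambda m: tanimoto(query, database[m]), reverse=True)
by_euclidean = sorted(database, key=lambda m: euclidean(query, database[m]))

print("Tanimoto ranking: ", by_tanimoto)
print("Euclidean ranking:", by_euclidean)
```

In this toy example `mol_C` contains every query fragment plus four others. The Tanimoto coefficient places it second, whereas the Euclidean distance places it last because each extra fragment counts as a mismatch. This shows only how the measures behave on these made-up fingerprints and does not reproduce the hit-rate results reported in the paper.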