Molecular Similarity Analysis and Virtual Screening by Mapping of Consensus Positions in Binary-Transformed Chemical Descriptor Spaces with Variable Dimensionality
- 8 November 2003
- journal article
- Published by American Chemical Society (ACS) in Journal of Chemical Information and Computer Sciences
- Vol. 44 (1), 21-29
- https://doi.org/10.1021/ci0302963
Abstract
A novel compound classification algorithm is described that operates in binary molecular descriptor spaces and groups active compounds together in a computationally highly efficient manner. The method involves the transformation of continuous descriptor value ranges into a binary format, subsequent definition of simplified descriptor spaces, identification of consensus positions of specific compound sets in these spaces, and iterative adjustments of the dimensionality of the descriptor spaces in order to discriminate compounds sharing similar activity from others. We term this approach Dynamic Mapping of Consensus positions (DMC) because the definition of reference spaces is tuned toward specific compound classes and their dimensionality is increased as the analysis proceeds. When applied to virtual screening, sets of bait compounds are added to a large screening database to identify hidden active molecules. In these calculations, molecules that map to consensus positions after elimination of most of the database compounds are considered hit candidates. In a benchmark study on five biological activity classes, hits for randomly assembled sets of bait molecules were correctly identified in 95% of virtual screening calculations in a source database containing more than 1.3 million molecules, thus providing a measure of the sensitivity of the DMC technique.
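The abstract describes the DMC steps only at a high level. As a rough illustration, the Python sketch below binarizes continuous descriptors, derives a consensus position from a set of bait compounds, and then adds one descriptor dimension at a time, discarding database compounds that do not map to the consensus position. Several details are assumptions that the abstract does not state: the database median as the binarization threshold, unanimous agreement among baits as the definition of a consensus bit, the one-descriptor-per-iteration schedule, and all names (`binarize`, `consensus_bits`, `dmc_screen`, `min_agreement`, `max_hits`). It should not be read as the authors' implementation.

```python
from statistics import median


def binarize(rows, thresholds):
    """Map each continuous descriptor value to 1 if above its threshold, else 0."""
    return [tuple(int(v > t) for v, t in zip(row, thresholds)) for row in rows]


def consensus_bits(bait_bits, min_agreement=1.0):
    """Return {descriptor index: bit} for descriptors on which the baits agree.

    min_agreement is a hypothetical parameter: the fraction of baits that
    must share a bit value for the descriptor to count as a consensus dimension.
    """
    n = len(bait_bits)
    consensus = {}
    for j in range(len(bait_bits[0])):
        ones = sum(bits[j] for bits in bait_bits)
        if max(ones, n - ones) / n >= min_agreement:
            consensus[j] = int(ones * 2 >= n)  # majority bit value
    return consensus


def dmc_screen(database, baits, max_hits=10):
    """Grow the descriptor space one dimension at a time until few compounds remain.

    Returns (indices of surviving database compounds, descriptor indices used).
    """
    # Assumption: thresholds are the per-descriptor medians of the database.
    thresholds = [median(column) for column in zip(*database)]
    db_bits = binarize(database, thresholds)
    consensus = consensus_bits(binarize(baits, thresholds))

    survivors = list(range(len(database)))
    used = []
    for j, bit in consensus.items():
        if len(survivors) <= max_hits:
            break  # enough of the database has been eliminated
        # Keep only compounds that match the consensus bit in this dimension.
        survivors = [i for i in survivors if db_bits[i][j] == bit]
        used.append(j)
    return survivors, used


# Toy data: six database compounds and two baits, three descriptors each.
database = [
    (0.1, 5.0, 300.0),
    (0.9, 1.0, 310.0),
    (0.8, 6.0, 120.0),
    (0.2, 2.0, 100.0),
    (0.7, 7.0, 330.0),
    (0.3, 8.0, 90.0),
]
baits = [(0.75, 6.5, 320.0), (0.95, 9.0, 400.0)]

hits, dims = dmc_screen(database, baits, max_hits=1)
print(hits, dims)  # → [4] [0, 1, 2]
```

In this toy run, all three descriptors are needed before a single hit candidate (compound 4) remains. With a looser `max_hits=2`, the loop stops after two dimensions and returns compounds 2 and 4.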