Recursive Median Partitioning for Virtual Screening of Large Databases
- 1 January 2003
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Chemical Information and Computer Sciences
- Vol. 43 (1), 182-188
- https://doi.org/10.1021/ci0203848
Abstract
Recently, we have introduced the median partitioning (MP) method for diversity selection and compound classification. The MP approach utilizes property descriptors with continuous value ranges, transforms these descriptors into a binary classification scheme by determining their medians in source databases, and divides database molecules in subsequent steps into populations above or below these medians. Having previously demonstrated the usefulness of MP for the classification of molecules according to biological activity, we have now gone a step further and extended the methodology for application in virtual screening. In these calculations, a series of bait molecules having desired activity is added to large compound databases, and subsequent iterations or recursions are carried out to reduce the number of candidate molecules until a small number of compounds are found in partitions enriched with bait molecules. For each recursion step, descriptor combinations are identified that copartition as many active molecules as possible. Descriptor selection is facilitated by application of a genetic algorithm (GA). The recursive MP approach (RMP) has been applied to five diverse biological activity classes in virtual screening of a database consisting of approximately 1.34 million molecules to which different types of active compounds were added. RMP analysis produced hit rates of up to 21%, dependent on the biological activity class, and led to an average approximately 3600-fold improvement over random selection for the activity classes that were used as test cases.
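The partitioning and recursion steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `partition_codes` and `recursive_mp` are hypothetical. The descriptor combination for each recursion is passed in by the caller instead of being chosen by a genetic algorithm. Two further simplifications are assumptions made here: medians are recomputed on the surviving molecules at each step, and only the single partition holding the most bait molecules is kept.

```python
from collections import Counter
from statistics import median


def partition_codes(rows):
    """Binarize each descriptor at its median; return one bit tuple per molecule.

    A bit is 1 when the value lies above the median of that descriptor,
    and 0 when it lies at or below it.
    """
    medians = [median(col) for col in zip(*rows)]
    return [tuple(int(v > m) for v, m in zip(row, medians)) for row in rows]


def recursive_mp(X, is_bait, descriptor_sets):
    """Shrink the candidate set once per descriptor combination.

    X               : list of descriptor vectors, one per molecule
    is_bait         : list of booleans marking the added active molecules
    descriptor_sets : one list of column indices per recursion
                      (assumed given here; the paper selects them with a GA)
    Returns the indices of the molecules that survive all recursions.
    """
    keep = list(range(len(X)))
    for cols in descriptor_sets:
        rows = [[X[i][c] for c in cols] for i in keep]
        codes = partition_codes(rows)
        # Count how many baits fall into each partition.
        bait_counts = Counter(
            code for i, code in zip(keep, codes) if is_bait[i]
        )
        if not bait_counts:
            break  # no baits left among the candidates
        # Assumption: keep only the partition that holds the most baits.
        best = bait_counts.most_common(1)[0][0]
        keep = [i for i, code in zip(keep, codes) if code == best]
    return keep


# Toy data: eight molecules, two descriptors, the last two are baits.
X = [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [6, 6], [7, 7], [8, 8]]
is_bait = [False] * 6 + [True] * 2
print(recursive_mp(X, is_bait, [[0, 1], [0]]))  # -> [6, 7]
```

In this toy run the first recursion halves the set to the four molecules above both medians, and the second isolates the two baits. In a real screen the surviving partition would also contain database compounds, which become the candidate hits.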