Prediction of Aqueous Solubility of a Diverse Set of Compounds Using Quantitative Structure−Property Relationships
- 23 July 2003
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Medicinal Chemistry
- Vol. 46 (17), 3572-3580
- https://doi.org/10.1021/jm020266b
Abstract
“Fail early and fail fast” is the current paradigm that the pharmaceutical industry has adopted widely. Removing non-drug-like compounds from the drug discovery lifecycle in the early stages can lead to tremendous savings of resources. Thus, fast screening methods are needed to profile the large collection of synthesized and virtual libraries involved in the early stage. Solubility is one of the filters that are applied extensively to ensure that the compounds are reasonably soluble so that synthesis of the compounds and assay studies of pharmacokinetics and toxicity are feasible. To address this need, we have developed a fast quantitative structure−property relationship (QSPR) model for the prediction of aqueous solubility (at 298 K, unbuffered solution) from the molecular structures. Multiple linear regressions and genetic algorithms were used to develop the models. The model was based on a set of diverse compounds including small organic molecules and drug and drug-like species. The predicted solubility for the training and test sets agrees well with the experimental values. The coefficient of determination is R² = 0.84 for the training set of 775 compounds, and the RMS error is 0.87. This model was validated on four sets of compounds. The RMS error for the 1665 compounds from the four validation data sets (including compounds from the Physician's Desk References and Comprehensive Medicinal Chemistry databases) is 1 log unit and the unsigned error is 0.77. This model does not require 3-D structure generation, which is rather time-consuming. Using 2-D structure as input, this model is able to compute solubility for 90 000−700 000 compounds/h on an SGI Origin 2000 workstation. This kind of fast calculation allows the model to be used in data mining and screening of large synthesized or virtual libraries.
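The fit statistics quoted in the abstract (R², RMS error, unsigned error) can be illustrated with a minimal sketch of a multiple linear regression. This is not the authors' model: the three descriptor columns, the coefficients, and the noise level below are synthetic assumptions, the helper names `fit_mlr`, `predict`, and `fit_statistics` are hypothetical, the unsigned error is assumed to mean the mean absolute error, and the genetic-algorithm step mentioned in the abstract is not shown. Only the training set size of 775 is taken from the text.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least-squares fit; returns coefficients with the intercept first."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Predicted value (standing in for log solubility) for each descriptor row."""
    return coef[0] + X @ coef[1:]


def fit_statistics(y_true, y_pred):
    """R^2, RMS error, and mean unsigned (absolute) error."""
    resid = y_true - y_pred
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rms": float(np.sqrt(np.mean(resid ** 2))),
        "mue": float(np.mean(np.abs(resid))),
    }


# Synthetic stand-in data (assumption): 775 "compounds", 3 made-up 2-D descriptors.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(775, 3))
true_coef = np.array([-2.0, -1.0, 0.5, 0.3])  # hypothetical intercept + slopes
y_train = predict(true_coef, X_train) + rng.normal(scale=0.8, size=775)

coef = fit_mlr(X_train, y_train)
stats = fit_statistics(y_train, predict(coef, X_train))
print(stats)
```

Because the RMS error weights large residuals more heavily, it is never smaller than the mean unsigned error, which is consistent with the validation figures reported above (1 log unit versus 0.77).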