Molecular Hashkeys: A Novel Method for Molecular Characterization and Its Application for Predicting Important Pharmaceutical Properties of Molecules
- 24 April 1999
- journal article
- Published by American Chemical Society (ACS) in Journal of Medicinal Chemistry
- Vol. 42 (10), 1739-1748
- https://doi.org/10.1021/jm980527a
Abstract
We define a novel numerical molecular representation, called the molecular hashkey, that captures sufficient information about a molecule to predict pharmaceutically interesting properties directly from three-dimensional molecular structure. The molecular hashkey represents molecular surface properties as a linear array of pairwise surface-based comparisons of the target molecule against a common 'basis-set' of molecules. Hashkey-measured molecular similarity correlates well with direct methods of measuring molecular surface similarity. Using a simple machine-learning technique with the molecular hashkeys, we show that it is possible to accurately predict the octanol−water partition coefficient, log P. Using more sophisticated learning techniques, we show that an accurate model of intestinal absorption for a set of drugs can be constructed using the same hashkeys used in the aforementioned experiments. Once a set of molecular hashkeys is calculated, its use in the training and testing of property-based models is very fast. Further, the required amount of data for model construction is very small. Neural network-based hashkey models trained on data sets as small as 30 molecules yield statistically significant prediction of molecular properties. The lack of a requirement for large data sets lends itself well to the prediction of pharmaceutically relevant molecular parameters for which data generation is expensive and slow. Molecular hashkeys coupled with machine-learning techniques can yield models that predict key pharmacological aspects of biologically important molecules and should therefore be important in the design of effective therapeutics.
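The hashkey idea described in the abstract (a linear array of pairwise comparisons of a target molecule against a fixed basis set) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the abstract does not specify the surface-comparison function or the learning technique, so `toy_similarity` is a hypothetical stand-in that compares short numeric tuples instead of 3D molecular surfaces, and the 1-nearest-neighbour predictor is a swapped-in simple learner, not the authors' method. The numbers are arbitrary toy values, not data from the paper.

```python
from math import exp, sqrt
from typing import Callable, List, Sequence

# Hypothetical stand-in for a molecule: a short numeric tuple.
# The paper works with 3D molecular surfaces instead.
Descriptor = Sequence[float]


def toy_similarity(a: Descriptor, b: Descriptor) -> float:
    """Placeholder comparison: Gaussian of the squared Euclidean distance.

    Returns 1.0 for identical inputs and decays towards 0.0 otherwise.
    """
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return exp(-d2)


def hashkey(
    target: Descriptor,
    basis_set: Sequence[Descriptor],
    similarity: Callable[[Descriptor, Descriptor], float] = toy_similarity,
) -> List[float]:
    """Linear array of pairwise comparisons of the target against each basis molecule."""
    return [similarity(target, b) for b in basis_set]


def hashkey_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Euclidean distance between two hashkeys of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(h1, h2)))


def predict_nearest(
    query_key: Sequence[float],
    train_keys: Sequence[Sequence[float]],
    train_values: Sequence[float],
) -> float:
    """Return the property value of the training molecule with the closest hashkey."""
    best = min(
        range(len(train_keys)),
        key=lambda i: hashkey_distance(query_key, train_keys[i]),
    )
    return train_values[best]


# Arbitrary toy inputs, chosen only to exercise the functions.
basis = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
train_molecules = [(0.1, 0.1), (0.9, 0.2)]
train_values = [1.0, 2.0]

train_keys = [hashkey(m, basis) for m in train_molecules]
query_key = hashkey((0.8, 0.1), basis)
prediction = predict_nearest(query_key, train_keys, train_values)
```

In this sketch the hashkeys are computed once and then reused by the predictor, which mirrors the abstract's point that model training and testing are fast once the hashkeys exist. The same keys could be passed to a different learner without recomputing any comparisons.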