An Investigation of Clustering as a Tool in Quantitative Structure-Activity Relationships (QSARS)

1 November 1995

journal article
research article
Published by Taylor & Francis in SAR and QSAR in Environmental Research

Vol. 4 (1), 1-10
https://doi.org/10.1080/10629369508234009

Abstract

Clustering makes large databases easy to manage. Clustering according to structural similarity distinguished the chemical classes present in our training set. All clusters except EINECS-cluster 1 showed a correlation of log WS with log K_OW and melting point. EINECS-cluster 1 contains only chemicals with melting points below room temperature, resulting in a relationship between log WS and log K_OW. The weak correlation observed for this cluster is probably due to the insufficient number of available screens: with so few screens, quite dissimilar chemicals can share the same cluster. Using statistical criteria, our approach yielded three QSARs with reasonably good predictive capabilities, originating from clusters 1639, 3472, and 5830. The models from the smaller clusters 6873, 8154, and 16424 have high correlation coefficients and describe their own clusters very well but, judged by our stringent bootstrap criterion, are close to random. Clusters 6815 and 18083 showed rather low correlations. The models originating from clusters 1639, 3472, and 5830 proved their usefulness in external validation: the log WS values calculated with our QSARs agreed to within 1 log unit with those reported in the literature.
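
The per-cluster modelling described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the data are synthetic, the coefficients used to generate them are arbitrary illustrative values, and the function names (`fit_qsar`, `r_squared`, `bootstrap_r2`) are hypothetical. The sketch fits log WS as a linear function of log K_OW and melting point for one cluster, then uses a simple out-of-bag bootstrap as one possible way to check whether the fit holds up on chemicals left out of each resample. The abstract does not specify the actual bootstrap criterion, so the one below is an assumption.

```python
import numpy as np


def fit_qsar(log_kow, mp, log_ws):
    """Least-squares fit of log WS = a*log Kow + b*MP + c; returns (a, b, c)."""
    X = np.column_stack([log_kow, mp, np.ones_like(log_kow)])
    coef, *_ = np.linalg.lstsq(X, log_ws, rcond=None)
    return coef


def r_squared(coef, log_kow, mp, log_ws):
    """Coefficient of determination of a fitted model on the given data."""
    X = np.column_stack([log_kow, mp, np.ones_like(log_kow)])
    resid = log_ws - X @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((log_ws - log_ws.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def bootstrap_r2(log_kow, mp, log_ws, n_boot=200, seed=0):
    """Refit on resampled chemicals and score on the ones left out (out-of-bag)."""
    rng = np.random.default_rng(seed)
    n = len(log_ws)
    scores = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)           # sample with replacement
        oob = np.setdiff1d(np.arange(n), idx)      # chemicals not drawn
        if oob.size < 2:
            continue                               # too few to score
        coef = fit_qsar(log_kow[idx], mp[idx], log_ws[idx])
        scores.append(r_squared(coef, log_kow[oob], mp[oob], log_ws[oob]))
    return np.array(scores)


# Synthetic "cluster" of 40 chemicals (assumed values, not data from the paper).
rng = np.random.default_rng(42)
n_chem = 40
log_kow = rng.uniform(1.0, 6.0, n_chem)
mp = rng.uniform(25.0, 200.0, n_chem)              # melting point, deg C
log_ws = 0.5 - 1.0 * log_kow - 0.01 * mp + rng.normal(0.0, 0.2, n_chem)

coef = fit_qsar(log_kow, mp, log_ws)
scores = bootstrap_r2(log_kow, mp, log_ws)
print("coefficients (a, b, c):", coef)
print("in-sample r2:", r_squared(coef, log_kow, mp, log_ws))
print("median out-of-bag r2:", np.median(scores))
```

A small cluster can give a high in-sample r2 while the out-of-bag scores scatter widely, which is the kind of behaviour the abstract reports for clusters 6873, 8154, and 16424.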
