Abstract
An accurate and generally applicable method was developed for estimating the aqueous solubility of organic compounds, using multilinear regression and artificial neural network modeling on a diverse set of 1297 compounds. Molecular connectivity, shape, and atom-type electrotopological state (E-state) indices were used as structural parameters. The data set was divided into a training set of 884 compounds and a randomly chosen test set of 413 compounds. The 30-12-1 artificial neural network took 30 structural parameters as inputs: 24 atom-type E-state indices and six other topological indices. For the test set, the network achieved a predictive r² = 0.92 and s = 0.60. With the same parameters, multilinear regression gave r² = 0.88 and s = 0.71.
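The evaluation protocol described above (a random 884/413 split, a multilinear fit on 30 descriptors, and r² and s computed on the held-out compounds) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the descriptor matrix is synthetic random data standing in for the real indices, the function names are hypothetical, r² is taken as 1 − SSE/SST, and s is taken as the root-mean-square residual, since the abstract does not define either statistic. The helper `count_weights` assumes a fully connected 30-12-1 network with one bias per hidden and output unit.

```python
import numpy as np


def split_indices(n_total, n_train, seed=0):
    """Randomly partition range(n_total) into train and test index arrays."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_total)
    return perm[:n_train], perm[n_train:]


def fit_mlr(X, y):
    """Ordinary least-squares fit; returns [intercept, coefficients...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Apply a fitted multilinear model to a descriptor matrix."""
    return coef[0] + X @ coef[1:]


def r2_and_s(y_obs, y_pred):
    """Assumed definitions: r2 = 1 - SSE/SST, s = sqrt(SSE / n)."""
    resid = y_obs - y_pred
    sse = float(np.sum(resid ** 2))
    sst = float(np.sum((y_obs - np.mean(y_obs)) ** 2))
    return 1.0 - sse / sst, float(np.sqrt(sse / len(y_obs)))


def count_weights(layers=(30, 12, 1)):
    """Trainable parameters of a fully connected net, biases included."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(layers[:-1], layers[1:]))


if __name__ == "__main__":
    # Synthetic placeholder data: 1297 "compounds" x 30 "descriptors".
    rng = np.random.default_rng(1)
    X = rng.normal(size=(1297, 30))
    y = X @ rng.normal(size=30) + rng.normal(scale=0.5, size=1297)

    train, test = split_indices(1297, 884)
    coef = fit_mlr(X[train], y[train])
    r2, s = r2_and_s(y[test], predict_mlr(coef, X[test]))
    print(f"test r2 = {r2:.2f}, s = {s:.2f}")
    print("30-12-1 parameters:", count_weights())
```

Under the bias assumption, a 30-12-1 network has (30 + 1) × 12 + (12 + 1) × 1 = 385 adjustable parameters, compared with 31 for the multilinear model on the same inputs.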
