Determining the Validity of a QSAR Model − A Classification Approach
- 24 November 2004
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Chemical Information and Modeling
- Vol. 45 (1), 65-73
- https://doi.org/10.1021/ci0497511
Abstract
Determining whether a QSAR model remains valid when applied to new compounds is an important concern in QSAR and QSPR modeling. Various scoring techniques can be applied to specific types of models. We present a technique with which we can state whether a new compound will be well predicted by a previously built QSAR model. In this study we focus on linear regression models only, though the technique is general and could also be applied to other types of quantitative models. Our technique is based on a classification method that divides the regression residuals from a previously generated model into a good class and a bad class and then builds a classifier based on this division. The trained classifier is then used to determine the class of the residual for a new compound. We investigated the performance of a variety of classifiers, both linear and nonlinear. The technique was tested on two data sets from the literature and a hand-built data set. The data sets selected covered both physical and biological properties and also presented the methodology with quantitative regression models of varying quality. The results indicate that this technique can determine whether a new compound will be well or poorly predicted, with weighted success rates ranging from 73% to 94% for the best classifier.
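The workflow described in the abstract (fit a regression model, split its residuals into a good and a bad class, train a classifier on that split, then classify a new compound) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic descriptors, the cutoff of one residual standard deviation, the choice of a k-nearest-neighbor classifier, and all function names.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Predictions of the fitted linear model for descriptor matrix X."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def label_residuals(residuals, cutoff=1.0):
    """Label each residual: 1 = bad (poorly predicted), 0 = good.

    The cutoff, in units of the residual standard deviation, is an
    assumption of this sketch, not a value from the paper.
    """
    z = np.abs(residuals) / residuals.std(ddof=1)
    return (z > cutoff).astype(int)


def knn_classify(X_train, labels, X_new, k=3):
    """Majority vote among the k nearest training compounds (Euclidean distance)."""
    dists = np.linalg.norm(X_new[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = labels[nearest].mean(axis=1)
    return (votes > 0.5).astype(int)


# Synthetic stand-in for a descriptor matrix and a measured property.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
# Noise grows when the first descriptor is large, so residual size depends on structure.
noise = rng.normal(size=200) * np.where(X[:, 0] > 1.0, 3.0, 0.3)
y = X @ np.array([1.5, -2.0, 0.5]) + noise

# Step 1: build the regression model and collect its residuals.
coef = fit_linear(X, y)
residuals = y - predict_linear(coef, X)

# Step 2: divide the residuals into good and bad classes.
labels = label_residuals(residuals)

# Step 3: classify new compounds as likely well (0) or poorly (1) predicted.
X_new = rng.normal(size=(5, 3))
predicted_class = knn_classify(X, labels, X_new)
```

In this sketch the classifier works in the same descriptor space as the regression model. The abstract reports comparing several linear and nonlinear classifiers, so the k-nearest-neighbor step should be read as one interchangeable choice rather than the authors' method.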