Results of a New Classification Algorithm Combining K Nearest Neighbors and Recursive Partitioning
- 1 November 2000
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Chemical Information and Computer Sciences
- Vol. 41 (1), 168-175
- https://doi.org/10.1021/ci0003348
Abstract
We present results of a new computational learning algorithm combining favorable elements of two well-known techniques: K nearest neighbors and recursive partitioning. Like K nearest neighbors, the method provides an independent prediction for each test sample under consideration, while like recursive partitioning, it incorporates an automatic selection of important input variables for model construction. The new method is applied to the problem of correctly classifying a set of chemical data samples designated as being either active or inactive in a biological screen. Training is performed at varying levels of intrinsic model complexity, and classification performance is compared to that of both K nearest neighbor and recursive partitioning models trained using the identical protocol. We find that the cross-validated performance of the new method outperforms both of these standard techniques over a considerable range of user parameters. We discuss advantages and drawbacks of the new method, with particular emphasis on its parameter robustness, required training time, and performance with respect to chemical structural class.
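The abstract does not say how the two techniques are actually combined, so the following is only an illustrative sketch and not the published algorithm. It assumes one simple way to pair the two ideas: rank input variables by the Gini impurity reduction of their best single threshold split (the criterion a recursive-partitioning tree might use for its first split), then run K nearest neighbors on the top-ranked variables only. All function names, the parameter `n_keep`, and the toy data are hypothetical; leave-one-out is used here as one possible form of cross-validation.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split_gain(column, labels):
    """Largest Gini reduction achievable by one threshold split on one variable."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    for t in sorted(set(column))[:-1]:  # candidate thresholds
        left = [y for x, y in zip(column, labels) if x <= t]
        right = [y for x, y in zip(column, labels) if x > t]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def select_variables(X, y, n_keep):
    """Rank variables by best single-split gain and keep the top n_keep."""
    n_vars = len(X[0])
    gains = [best_split_gain([row[j] for row in X], y) for j in range(n_vars)]
    ranked = sorted(range(n_vars), key=lambda j: gains[j], reverse=True)
    return ranked[:n_keep]


def knn_predict(X, y, query, k, variables):
    """Majority vote among the k nearest training rows, using only `variables`."""
    def dist(row):
        return sum((row[j] - query[j]) ** 2 for j in variables)
    nearest = sorted(range(len(X)), key=lambda i: dist(X[i]))[:k]
    return Counter(y[i] for i in nearest).most_common(1)[0][0]


def loo_accuracy(X, y, k, n_keep=None):
    """Leave-one-out accuracy; n_keep=None means plain KNN on all variables."""
    hits = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        if n_keep is None:
            variables = list(range(len(X[0])))
        else:
            # Variable selection is redone inside each fold to avoid leakage.
            variables = select_variables(X_train, y_train, n_keep)
        hits += knn_predict(X_train, y_train, X[i], k, variables) == y[i]
    return hits / len(X)


# Hypothetical toy data: variable 0 separates the classes, variable 1 is
# large-scale noise that dominates the plain Euclidean distance.
X = [[0.0, 50], [0.1, -40], [0.2, 30], [0.3, -20],
     [1.0, -45], [1.1, 35], [1.2, -25], [1.3, 55]]
y = ["inactive"] * 4 + ["active"] * 4

print("plain KNN:", loo_accuracy(X, y, k=1))
print("KNN with split-based variable selection:", loo_accuracy(X, y, k=1, n_keep=1))
```

On this toy set the noisy second variable misleads plain KNN, while the variable-selection step discards it. Sweeping `k` and `n_keep` would be one way to mimic the comparison across "varying levels of intrinsic model complexity" that the abstract mentions.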