Comparison of 2D Fingerprint Types and Hierarchy Level Selection Methods for Structural Grouping Using Ward's Clustering
- 23 November 1999
- journal article
- Published by American Chemical Society (ACS) in Journal of Chemical Information and Computer Sciences
- Vol. 40 (1), 155-162
- https://doi.org/10.1021/ci990086j
Abstract
Four different two-dimensional fingerprint types (MACCS, Unity, BCI, and Daylight) and nine methods of selecting optimal cluster levels from the output of a hierarchical clustering algorithm were evaluated for their ability to select clusters that represent chemical series present in some typical examples of chemical compound data sets. The methods were evaluated using a Ward's clustering algorithm on subsets of the publicly available National Cancer Institute HIV data set, as well as with compounds from our corporate data set. We make a number of observations and recommendations about the choice of fingerprint type and cluster level selection methods for use in this type of clustering.
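To make the workflow in the abstract concrete, here is a minimal Python sketch of Ward's clustering on binary fingerprints followed by a cluster level selection step. It is an illustration under stated assumptions, not the authors' procedure: the 8-bit vectors are toy data rather than MACCS, Unity, BCI, or Daylight fingerprints, the helper names (`ward_merges`, `pick_level`, `cut`) are hypothetical, and the "largest jump in merge height" rule is a common stand-in heuristic that is not claimed to be one of the nine methods evaluated in the article.

```python
from itertools import combinations


def sq_euclid(a, b):
    """Squared Euclidean distance between two equal-length bit vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ward_merges(fps):
    """Agglomerate with Ward's criterion via the Lance-Williams update.

    Returns one (members_a, members_b, height) tuple per merge, in order.
    """
    clusters = {i: [i] for i in range(len(fps))}  # cluster id -> member indices
    dist = {frozenset(p): sq_euclid(fps[p[0]], fps[p[1]])
            for p in combinations(clusters, 2)}
    merges, next_id = [], len(fps)
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)            # cheapest merge
        i, j = sorted(pair)
        d_ij = dist[pair]
        ni, nj = len(clusters[i]), len(clusters[j])
        merges.append((clusters[i], clusters[j], d_ij))
        # Ward distance from every other cluster k to the merged cluster.
        updated = {}
        for k in clusters:
            if k in (i, j):
                continue
            nk = len(clusters[k])
            d_ik = dist[frozenset((i, k))]
            d_jk = dist[frozenset((j, k))]
            updated[k] = ((ni + nk) * d_ik + (nj + nk) * d_jk
                          - nk * d_ij) / (ni + nj + nk)
        dist = {p: d for p, d in dist.items() if i not in p and j not in p}
        clusters[next_id] = clusters.pop(i) + clusters.pop(j)
        for k, d in updated.items():
            dist[frozenset((k, next_id))] = d
        next_id += 1
    return merges


def pick_level(merges):
    """Assumed heuristic: stop just before the largest jump in merge height."""
    heights = [h for _, _, h in merges]
    jumps = [heights[t + 1] - heights[t] for t in range(len(heights) - 1)]
    t = max(range(len(jumps)), key=jumps.__getitem__)
    n = len(merges) + 1                           # number of compounds
    return n - (t + 1)                            # clusters left after t + 1 merges


def cut(n, merges, k):
    """Replay the first n - k merges and return the resulting k clusters."""
    clusters = [[i] for i in range(n)]
    for a, b, _ in merges[:n - k]:
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return sorted(sorted(c) for c in clusters)


# Toy data: two "series" that share bits 0-1 and bits 4-5 respectively.
fps = [
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
]
merges = ward_merges(fps)
k = pick_level(merges)
series = cut(len(fps), merges, k)
print(k, series)  # 2 [[0, 1, 2], [3, 4, 5]]
```

The sketch keeps the two stages separate on purpose: the hierarchy is built once, and different level selection rules can then be swapped in for `pick_level` and compared on the same merge history, which mirrors the kind of comparison the abstract describes.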