Statistical Molecular Design of Building Blocks for Combinatorial Chemistry
- 8 March 2000
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Medicinal Chemistry
- Vol. 43 (7), 1320-1328
- https://doi.org/10.1021/jm991118x
Abstract
The size of a combinatorial library can be reduced in two ways: by basing the selection on the building blocks (BBs) or by basing it on the full set of virtually constructed products. In this paper we have investigated the effects of applying statistical designs to BB sets compared to selections based on the final products. The two sets of BBs and the virtually constructed library were described by structural parameters, and the correlation between the two characterizations was investigated. Three different selection approaches were used, both for the BB sets and for the products. In the first two, the selection algorithms were applied directly to the data sets (D-optimal design and space-filling design), while in the third a cluster analysis preceded the selection (cluster-based design). The selections were compared using visual inspection, the Tanimoto coefficient, the Euclidean distance, the condition number, and the determinant of the resulting data matrix. No difference in efficiency was found between selections made in the BB space and in the product space. However, it is critically important to investigate the BB space carefully and to select enough BBs to give adequate diversity. An example from the pharmaceutical industry is then presented, in which selection via BBs was made using a cluster-based design.
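The comparison criteria named in the abstract (Tanimoto coefficient, Euclidean distance, condition number, determinant) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the function names are hypothetical, the greedy maximin rule is a simple stand-in for a space-filling design and is not claimed to be the algorithm used in the paper, and the determinant is taken of M^T M for a model matrix M with an intercept column, which is one common D-optimality convention.

```python
import numpy as np


def tanimoto(a, b):
    """Tanimoto coefficient between two binary fingerprints."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # assumed convention for two empty fingerprints
    return float(np.logical_and(a, b).sum() / union)


def maximin_select(X, k, start=0):
    """Greedy space-filling pick: repeatedly add the candidate that is
    farthest (Euclidean) from the points already chosen."""
    X = np.asarray(X, dtype=float)
    chosen = [start]
    # Distance from every candidate to its nearest chosen point.
    nearest = np.linalg.norm(X - X[start], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(nearest))
        chosen.append(nxt)
        nearest = np.minimum(nearest, np.linalg.norm(X - X[nxt], axis=1))
    return chosen


def design_metrics(X, idx):
    """Condition number, determinant of M^T M, and smallest pairwise
    Euclidean distance for the selected rows of descriptor matrix X."""
    S = np.asarray(X, dtype=float)[list(idx)]
    M = np.column_stack([np.ones(len(S)), S])  # intercept + descriptors
    dists = [
        np.linalg.norm(S[i] - S[j])
        for i in range(len(S))
        for j in range(i + 1, len(S))
    ]
    return {
        "condition_number": float(np.linalg.cond(M)),
        "determinant": float(np.linalg.det(M.T @ M)),
        "min_distance": float(min(dists)),
    }


# Toy descriptor space (made-up values): four corners plus two interior points.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5], [0.1, 0.1]]
spread = maximin_select(X, 4)   # indices of the space-filling selection
clumped = [0, 5, 4, 1]          # a deliberately less spread-out selection
print(design_metrics(X, spread))
print(design_metrics(X, clumped))
```

In this toy case the spread-out selection gives a larger determinant and a larger minimum distance than the clumped one, which is the kind of contrast these criteria are meant to expose. Real building-block descriptors would normally be scaled, and often reduced to principal components, before such a comparison.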