Toward General Methods of Targeted Library Design: Topomer Shape Similarity Searching with Diverse Structures as Queries

Abstract
A promising strategy for selecting synthetic targets is similarity-based searching of very large “virtual libraries”, which comprise all structures accessible by linking two or three commercially available building blocks with combinatorial syntheses. To assess the general applicability of this strategy, leading structures taken from each of 34 recent medicinal chemistry publications were used as queries to search a virtual library containing 2.6 × 10¹³ products from seven reactions, using a topomer shape similarity metric. Eighty-five percent of these searches succeeded, by yielding, with a search radius no greater than 120 topomer shape units, either at least 400 hits or hits from at least six sublibraries. From these 34 sets of search results, 122 representative structures were selected, illustrating potential “lead hops”, or otherwise novel structures. Overall shape similarity to the query structure was confirmed for up to 95% of these representative structures, according to FLEXS, an algorithmically distinct program. Experimentally, there were 28 structures among those reported in the 34 query publications that were identified within the virtual library. Among these, the frequency of high activity was 87% for the 16 structures whose similarity to their query was 90 topomer units or less, compared to a frequency of 50% for the other 12 structures.
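
The success criterion stated above can be read as a simple test on the hits returned for one query. The following is a minimal sketch of that reading, not the authors' implementation: the `Hit` record, its field names, and the function `search_succeeded` are hypothetical, and it assumes that hits are counted at the maximum radius of 120 topomer shape units.

```python
from dataclasses import dataclass

# Thresholds taken from the abstract.
MAX_RADIUS = 120        # search radius, in topomer shape units
MIN_HITS = 400          # minimum number of hits within the radius
MIN_SUBLIBRARIES = 6    # minimum number of distinct sublibraries with a hit


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one virtual-library product returned by a search."""
    sublibrary: str     # identifier of the sublibrary the product belongs to
    distance: float     # topomer shape distance to the query structure


def search_succeeded(hits):
    """Return True if a query's hits meet the abstract's success criterion.

    Assumption: only hits within MAX_RADIUS count, and either threshold
    (hit count or sublibrary count) is sufficient on its own.
    """
    within = [h for h in hits if h.distance <= MAX_RADIUS]
    sublibraries = {h.sublibrary for h in within}
    return len(within) >= MIN_HITS or len(sublibraries) >= MIN_SUBLIBRARIES
```

Under this reading, a query that returns only a handful of hits still counts as successful if those hits are spread over six or more sublibraries, which is why the two conditions are joined with "or".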
