Identification of “Latent Hits” in Compound Screening Collections

Jordi Mestres*,† and Gerrit H. Veeneman‡

Computational Medicinal Chemistry, Organon Laboratories Ltd., Newhouse ML1 5SH, Scotland, U.K., and Lead Discovery Unit, N.V. Organon, 5340 BH, Oss, The Netherlands

Received April 11, 2003

Abstract: The relatively low hit rates obtained from high-throughput screening have raised the question of whether this technology alone is sufficient to exploit the full potential of current corporate screening collections. The present study introduces a knowledge-based strategy for identifying “latent hits”, i.e., inactive compounds that could potentially be promoted to hits through simple chemical transformations. Examples are given of submicromolar agonist hits derived from the corresponding latent hits for the estrogen receptor.

Introduction. Over the past few years, the pharmaceutical industry has relied heavily on high-throughput screening (HTS) technology for the identification of hits. Large investments and logistic efforts have been made in setting up robotic equipment and in acquiring, assembling, and maintaining large corporate compound collections. However, alongside the established value of HTS as a means for identifying hits from screening collections, it has been widely recognized that the quantity, quality, and diversity of the hits identified have been below original expectations.¹ In addition, a number of important quality issues have emerged, namely, quality of the assays (often leading to poor-quality data, large numbers of false positives, and a tedious active-confirmation process), quality of the collection (purity and stability of compounds, as well as origin, composition, and representation of the collection), and quality of the logistics (purchase, tracking, handling, and storage of compounds and data management).
The combination of some or all of these issues is probably responsible for the relatively low hit rates found and has led to a significant increase in the cost of maintaining HTS.

Despite all efforts, everyone is aware that any corporate screening library is intrinsically incomplete. While the estimated number of synthesizable compounds is on the order of 10⁴⁰, corporate screening collections contain on the order of 10⁶ compounds. It therefore seems naive to think that compounds possessing the optimal features, arranged in an optimal way around a core structure to bind to our protein target of interest, will habitually be present in our screening collection. As shown recently, the probability of a compound having the right key functional groups arranged optimally decreases dramatically as the complexity of the molecule increases.² In contrast, it seems likely that compounds having almost the right features arranged almost optimally around a core structure will indeed be present in compound libraries. The subtle differences between the optimal and the almost optimal presence and arrangement of the key structural features in a compound may ultimately decide whether or not activity is detected for that compound in an HTS assay.

Part of the problem lies in the fact that a significant number of compounds present in corporate screening collections were originally acquired from external chemical suppliers with the aim of covering as much chemical space as possible within the size of the compound collection. Consequently, compound selections were mainly directed by strict diversity criteria, which meant that in most instances only the centroid compound of a cluster containing many structurally similar compounds was selected for purchase. Correspondingly, the chances of that centroid compound having the optimal features arranged in the optimal way are very slim.
As recently shown, this centroid selection strategy may potentially lead to a 70% chance that the activity within the cluster will not be discovered.³

As a result of all these factors, realistically, by just screening, we seldom find optimal, highly active compounds but at best suboptimal, weakly active compounds that still require extensive and expensive optimization programs. However, an unknown number of nonoptimal inactive compounds that could potentially be promoted to active compounds by means of simple chemical transformations often remains entirely overlooked after completion of an HTS campaign. These “potentially active inactive compounds” will be referred to as “latent hits”, and their identification is the focus of the present study. The particular knowledge-based strategy adopted for the identification of latent hits for the estrogen receptor subtype α (ERα) and examples of ERα agonist hits derived from the corresponding latent hits are presented next.

Identification of Latent Hits. The similarity-property principle states that compounds that are structurally similar are expected to have similar activities.⁴ However, medicinal chemists have often noted that some structural differences have stronger effects on the activity than others. Small modifications will in some cases produce a slight modulation of the activity, whereas in other cases they will result in a dramatic loss of activity. These effects were observed in the context of an HTS campaign aimed at identifying ERα agonists. Table 1 illustrates the effect on the HTS data of small structural changes to the ER endogenous steroid hormone 17β-estradiol. On the basis of these HTS data, some qualitative structure-activity relationships can be derived.
With 17β-estradiol (1) as a reference, Table 1 shows that substitution of the 3-hydroxy group by a methoxy group (2)⁵ results in a significant loss of activity, whereas substitution of both the 3- and 17-hydroxy groups by methoxy groups (3)⁶ leads to a complete loss of activity. The relative loss in activity can be rationalized by considering the number of mutation points in the structure of each compound with respect to the key features present in 17β-estradiol, which can be referred to as “pharmacophore latency”.

* To whom correspondence should be addressed. Phone: +44 01698 736109. Fax: +44 01698 736187. E-mail: j.mestres@organon.co.uk.
† Computational Medicinal Chemistry.
‡ Lead Discovery Unit.

J. Med. Chem. 2003, 46, 3441-3444. 10.1021/jm034078c CCC: $25.00 © 2003 American Chemical Society. Published on Web 07/01/2003
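
The counting of mutation points described for compounds 1-3 can be illustrated with a minimal Python sketch. It is an assumption-laden illustration and not the procedure used in the study: the dictionary encoding of substituents, the position labels `C3` and `C17`, and the function names `mutation_points` and `pharmacophore_latency` are hypothetical, and only the two hydroxy groups mentioned in the text are treated as key features.

```python
# Illustrative sketch only: the encoding and names below are assumptions,
# not the authors' implementation.

# Key features of the reference, 17beta-estradiol (1), as named in the text:
# the 3-hydroxy and 17-hydroxy groups.
REFERENCE = {"C3": "OH", "C17": "OH"}

# Compounds 1-3 as described in the text (substituent at each key position).
COMPOUNDS = {
    1: {"C3": "OH", "C17": "OH"},    # 17beta-estradiol
    2: {"C3": "OMe", "C17": "OH"},   # 3-hydroxy replaced by methoxy
    3: {"C3": "OMe", "C17": "OMe"},  # both hydroxy groups replaced by methoxy
}


def mutation_points(compound, reference=REFERENCE):
    """Return the key positions where the compound differs from the reference."""
    return sorted(
        position
        for position, group in reference.items()
        if compound.get(position) != group
    )


def pharmacophore_latency(compound, reference=REFERENCE):
    """Count the mutation points relative to the reference key features."""
    return len(mutation_points(compound, reference))


# Latency per compound number: larger values mean more key features are missing.
latency = {cid: pharmacophore_latency(c) for cid, c in COMPOUNDS.items()}

for cid, value in latency.items():
    print(f"compound {cid}: {value} mutation point(s) at {mutation_points(COMPOUNDS[cid])}")
```

Under this encoding, a compound with a small nonzero count is the kind of candidate the text calls a latent hit: one or two simple transformations away from the reference key features. How the count relates quantitatively to the loss of activity is not something this sketch attempts to model.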