Identification of “Latent Hits” in
Compound Screening Collections
Jordi Mestres*,† and Gerrit H. Veeneman‡
Computational Medicinal Chemistry, Organon Laboratories
Ltd., Newhouse ML1 5SH, Scotland, U.K., and
Lead Discovery Unit, N.V. Organon,
5340 BH, Oss, The Netherlands
Received April 11, 2003
Abstract: The relatively low hit rates found from
high-throughput screening have raised the question of whether
this technology alone is sufficient to exploit the full
potential of current corporate screening collections. The present
study introduces a knowledge-based strategy for identifying
“latent hits”, i.e., inactive compounds that could potentially
be promoted to hits through simple chemical transformations.
Examples are given of submicromolar agonist hits derived from
the corresponding latent hits for the estrogen receptor.
Introduction. Over the past few years, the pharma-
ceutical industry has relied heavily on high-throughput
screening (HTS) technology for the identification of
hits. Large investments and logistic efforts have been made
in setting up robotic equipment and in acquiring,
assembling, and maintaining big corporate compound
collections. However, alongside the established value of
HTS as a means for identifying hits from screening
collections, it has been widely recognized that the
quantity, quality, and diversity of the hits identified
have been below original expectations.¹ In addition, a
number of important quality issues have emerged,
namely, quality of the assays (often leading to poor
quality of data, large numbers of false positives, and
a tedious active confirmation process), quality of the
collection (purity and stability of compounds, as well
as origin, composition, and representation of the collec-
tion), and quality of the logistics (purchase, tracking,
handling, and storage of compounds and data manage-
ment). The combination of some or all of these issues is
probably responsible for the relatively low hit rates
found and has led to a significant increase in the cost of
maintaining HTS.
Despite all efforts, everyone is aware that any corpo-
rate screening library is intrinsically incomplete. While
the estimated number of synthesizable compounds is on
the order of 10⁴⁰, corporate screening collections contain
on the order of 10⁶ compounds. It seems therefore naive
to think that compounds possessing the optimal features
arranged in an optimal way around a core structure to
bind to our protein target of interest will habitually be
present in our screening collection. As shown recently,
the probability of a compound having the right key
functional groups arranged optimally decreases dramatically
as the complexity of the molecule increases.²
In contrast, it seems likely that compounds having
almost the right features arranged almost optimally
around a core structure will indeed be present in
compound libraries. The subtle differences between the
optimal and the almost optimal presence and arrangement
of the key structural features in a compound may
ultimately determine whether activity is detected for that
compound in an HTS assay.
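The argument above can be made concrete with a deliberately simple sketch. The coverage figure uses only the orders of magnitude quoted in the text. The second part is a toy model that is an assumption of this sketch and not the model of the cited work: each of k key features is taken to be "right" independently with probability p, and the values of k and p are hypothetical.

```python
from math import comb

def p_exact(k: int, p: float) -> float:
    """Probability that all k key features are right (independent toy model)."""
    return p ** k

def p_near_miss(k: int, p: float, m: int = 1) -> float:
    """Probability that exactly m of the k key features are wrong."""
    return comb(k, m) * p ** (k - m) * (1 - p) ** m

# Fraction of the estimated synthesizable space held in a collection,
# using the orders of magnitude quoted in the text (10^6 out of 10^40).
coverage = 10**6 / 10**40
print(f"coverage of chemical space ~ {coverage:.0e}")

# k (number of key features) and p = 0.5 are hypothetical illustration values.
for k in (2, 4, 8):
    print(k, p_exact(k, 0.5), p_near_miss(k, 0.5))
```

Under these assumptions the chance of an exact match falls geometrically with k, while compounds with a single wrong feature become comparatively more common, which is the situation the latent-hit concept addresses.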
Part of the problem lies in the fact that a significant
number of compounds present in corporate screening
collections were originally acquired from external chemi-
cal suppliers with the aim of covering as much chemical
space as possible within the size of the compound
collection. Consequently, compound selections were
mainly directed by strict diversity criteria, which meant
that in most instances only the centroid compound of a
cluster containing many structurally similar compounds
was selected for purchase. Correspondingly, the chances
of that centroid compound having the optimal features
arranged in the optimal way are very slim. As recently
shown, this centroid selection strategy may potentially
lead to a 70% chance that the activity within the cluster
will not be discovered.³
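As a rough illustration of why screening only one representative per cluster can hide activity, the sketch below computes the chance that none of m screened members of a cluster is active. It assumes that the screened members are effectively a random draw with respect to activity; the cluster size and number of actives are hypothetical values chosen only to reproduce the order of the figure quoted above, not data from the cited study.

```python
from math import comb

def p_miss(n: int, a: int, m: int = 1) -> float:
    """Chance that none of m members drawn from a cluster of n is active,
    when a of the n members are active (hypergeometric, no replacement)."""
    if not (0 <= a <= n and 0 <= m <= n):
        raise ValueError("require 0 <= a <= n and 0 <= m <= n")
    return comb(n - a, m) / comb(n, m)

# Hypothetical cluster: 10 similar compounds, 3 of them active.
print(p_miss(10, 3, m=1))  # only the centroid is screened
print(p_miss(10, 3, m=3))  # three members are screened
```

In this toy setting, screening a few additional neighbors of the centroid lowers the miss probability sharply, at the price of a less diverse selection for a collection of fixed size.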
As a result of all these factors, realistically, by just
screening, we seldom find optimal, highly active compounds
but at best suboptimal, weakly active compounds
that still require extensive and expensive
optimization programs. However, an unknown number
of nonoptimal inactive compounds that could potentially
be promoted to active compounds by means of simple
chemical transformations often remains entirely overlooked
after completion of an HTS campaign. These
“potentially active inactive compounds” will be referred
to as “latent hits”, and their identification is the focus
of the present study. The particular knowledge-based
strategy adopted for the identification of latent hits for
the estrogen receptor subtype α (ERα) and examples of
ERα agonist hits derived from the corresponding latent
hits are presented next.
Identification of Latent Hits. The similarity-
property principle states that compounds that are
structurally similar are expected to have similar activities.⁴
However, medicinal chemists have often noted
that some structural differences have stronger effects
on the activity than others. Small modifications will
produce in some cases a slight modulation of the
activity, whereas in other cases they will result in a
dramatic loss of activity. These effects were observed
in the context of an HTS campaign aiming at identifying
ERα agonists. Table 1 illustrates the effect on the HTS
data of small structural changes to the ER endogenous
steroid hormone 17β-estradiol. On the basis of these
HTS data, some qualitative structure-activity relationships
can be derived. With 17β-estradiol (1) as a
reference, Table 1 shows that substitution of the 3-hydroxy
group by a methoxy group (2)⁵ results in a
significant loss of activity whereas substitution of both
the 3- and 17-hydroxy groups by methoxy groups (3)⁶
leads to a complete loss in activity. The relative loss in
activity can be rationalized by considering the number
of mutation points in the structure of each compound
with respect to the key features present in 17β-estradiol,
which can be referred to as “pharmacophore latency”.
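The counting of mutation points described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the dictionary encoding, the names REFERENCE, COMPOUNDS, and mutation_points, and the restriction to the 3- and 17-positions are assumptions made here, based only on the three compounds discussed in the text.

```python
# Key features of the reference, 17β-estradiol (1), as described in the text:
# a hydroxy group at position 3 and a hydroxy group at position 17.
REFERENCE = {"3": "OH", "17": "OH"}

# Substituents at the same positions for compounds 1-3 (hypothetical encoding).
COMPOUNDS = {
    1: {"3": "OH", "17": "OH"},    # 17β-estradiol
    2: {"3": "OMe", "17": "OH"},   # 3-methoxy analogue
    3: {"3": "OMe", "17": "OMe"},  # 3,17-dimethoxy analogue
}

def mutation_points(compound: dict, reference: dict = REFERENCE) -> int:
    """Count key-feature positions where the compound differs from the reference."""
    return sum(1 for pos, group in reference.items() if compound.get(pos) != group)

for cid, features in COMPOUNDS.items():
    print(f"compound {cid}: {mutation_points(features)} mutation point(s)")
```

With this encoding, compounds 1, 2, and 3 have zero, one, and two mutation points, respectively, which parallels the progressive loss of activity reported for them.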
* To whom correspondence should be addressed. Phone: +44 01698
736109. Fax: +44 01698 736187. E-mail: j.mestres@organon.co.uk.
† Computational Medicinal Chemistry.
‡ Lead Discovery Unit.
J. Med. Chem. 2003, 46, 3441-3444
10.1021/jm034078c CCC: $25.00 © 2003 American Chemical Society
Published on Web 07/01/2003