Shape Signatures: A New Approach to Computer-Aided Ligand- and
Receptor-Based Drug Design
Randy J. Zauhar,*,† Guillermo Moyna,† LiFeng Tian,† ZhiJian Li,† and William J. Welsh‡
Department of Chemistry & Biochemistry, University of the Sciences in Philadelphia, 600 S. 43rd Street,
Philadelphia, Pennsylvania 19104 and Department of Pharmacology, Robert Wood Johnson Medical School,
University of Medicine and Dentistry of New Jersey, 675 Hoes Lane, Piscataway, New Jersey 08854
Received May 20, 2003
A unifying principle of rational drug design is the use of either shape similarity or
complementarity to identify compounds expected to be active against a given target. Shape
similarity is the underlying foundation of ligand-based methods, which seek compounds with
structure similar to known actives, while shape complementarity is the basis of most receptor-
based design, where the goal is to identify compounds complementary in shape to a given
receptor. These approaches can be extended to include molecular descriptors in addition to
shape, such as lipophilicity or electrostatic potential. Here we introduce a new technique, which
we call shape signatures, for describing the shape of ligand molecules and of receptor sites.
The method uses a technique akin to ray-tracing to explore the volume enclosed by a ligand
molecule, or the volume exterior to the active site of a protein. Probability distributions are
derived from the ray-trace, and can be based solely on the geometry of the reflecting ray, or
may include joint dependence on properties, such as the molecular electrostatic potential,
computed over the surface. Our shape signatures are just these probability distributions, stored
as histograms. They converge rapidly with the length of the ray-trace, are independent of
molecular orientation, and can be compared quickly using simple metrics. Shape signatures
can be used to test both for shape similarity between compounds and for shape complementarity
between compounds and receptors and thus can be applied to problems in both ligand- and
receptor-based molecular design. We present results for comparisons between small molecules
of biological interest and the NCI Database using shape signatures under two different metrics.
Our results show that the method can reliably extract compounds of shape (and polarity) similar
to the query molecules. We also present initial results for a receptor-based strategy using shape
signatures, with application to the design of new inhibitors predicted to be active against HIV
protease.
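The procedure summarized above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the molecular volume is a union of atom-centered spheres, which need not match the surface representation used in this work. A ray is reflected specularly off the enclosing surface, the lengths of the ray segments are collected, and the normalized histogram of those lengths serves as the signature. The function names, the bin settings, and the L1 comparison metric are assumptions of the sketch rather than details taken from the text.

```python
import numpy as np

def exit_from_union(p, d, centers, radii, eps=1e-9):
    """Distance from interior point p along unit vector d to the surface of a
    union of spheres, plus the outward unit normal at the exit point."""
    oc = p - centers                               # vectors from centers to p
    b = oc @ d
    c = np.einsum("ij,ij->i", oc, oc) - radii ** 2
    disc = b * b - c
    hit = disc > 0                                 # spheres crossed by the line
    root = np.sqrt(disc[hit])
    t_in, t_out = -b[hit] - root, -b[hit] + root   # entry/exit along the ray
    idx = np.flatnonzero(hit)
    end, last = eps, -1
    # Sweep intervals in order of entry, merging those that overlap the
    # connected stretch of the ray that starts at p.
    for k in np.argsort(t_in):
        if t_in[k] > end:                          # gap: ray has left the union
            break
        if t_out[k] > end:
            end, last = t_out[k], idx[k]
    if last < 0:
        raise ValueError("start point is not inside the sphere union")
    normal = (p + end * d - centers[last]) / radii[last]
    return end, normal

def trace_segments(centers, radii, n_reflections=2000, seed=0):
    """Reflect one ray inside the sphere union; return the segment lengths."""
    centers = np.asarray(centers, dtype=float)
    radii = np.asarray(radii, dtype=float)
    rng = np.random.default_rng(seed)
    p = centers[0].copy()                          # an atom center is interior
    d = rng.normal(size=3)
    d /= np.linalg.norm(d)
    lengths = []
    for _ in range(n_reflections):
        t, n = exit_from_union(p, d, centers, radii)
        lengths.append(t)
        p = p + t * d                              # move to the surface
        d = d - 2.0 * (d @ n) * n                  # specular reflection
        d /= np.linalg.norm(d)
    return np.array(lengths)

def signature(lengths, bins=40, max_len=10.0):
    """Normalized histogram of segment lengths (the 1D signature)."""
    hist, _ = np.histogram(lengths, bins=bins, range=(0.0, max_len))
    return hist / hist.sum()

def l1_distance(sig_a, sig_b):
    """Sum of absolute bin differences between two signatures."""
    return float(np.abs(sig_a - sig_b).sum())

# Hypothetical two-atom "molecule" compared with a single-atom one.
sig_two = signature(trace_segments([[0, 0, 0], [1.5, 0, 0]], [1.0, 1.0]))
sig_one = signature(trace_segments([[0, 0, 0]], [1.0]))
print(l1_distance(sig_two, sig_one))
```

Because the histogram depends only on segment lengths, rigidly rotating or translating the atom coordinates changes the signature only through sampling noise. Both signatures must share the same `bins` and `max_len` for the bin-wise comparison to be meaningful.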
Introduction
A universal problem in computer-aided drug design
is the comparison of molecular shape.1-3 In ligand-based
design, the underlying assumption is that a biologically
active compound is complementary in shape to some
target receptor, suggesting that molecules similar in
shape and electrostatic properties to a known active
compound will themselves be complementary to the
receptor and also active. In receptor-based design, the
structure of the target binding site is already known in
atomic detail, and the goal is to directly identify
compounds that are complementary to the site both in
shape and polarity.
A number of methods have been devised for screening
compound libraries for molecules likely to be active
against a selected target.4-12 Most of these take molecular
shape into account, either explicitly or implicitly.
Perhaps the most popular ligand-based strategy that
takes shape explicitly into account is CoMFA13,14
(comparative molecular field analysis) wherein the van der
Waals and electrostatic fields of molecules are sampled
over a grid and used as descriptors in a regression model
intended to predict biological activity. CoMFA thus
includes both molecular shape and polarity. The various
methods for defining pharmacophore models represent
ligand shape implicitly by incorporating some collection
of hydrogen bond acceptors and donors and regions of
steric bulk and imposing intergroup distance constraints;
this 3D geometric information clearly depends
on molecular shape. A number of approaches have been
developed that compute topological descriptors of molecules,
beginning with chemical structure or starting
with the wave function; such descriptors derive directly
from molecular shape. Even methods based on chemical
fingerprints include implicit shape information, since
only a restricted family of compounds will be compatible
with the chemical and connectivity information contained
in the fingerprint.
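The grid-sampling step attributed to CoMFA above can be sketched as follows. This is a simplified illustration, not the CoMFA implementation: it assumes a point-charge Coulomb term and a generic 12-6 steric term in arbitrary units, with a distance floor `r_min` to avoid singularities near atoms. The function name and all parameters are hypothetical.

```python
import numpy as np

def field_descriptors(coords, charges, grid_min, grid_max, spacing=2.0, r_min=1.0):
    """Sample simple steric and electrostatic fields on a regular grid.

    Returns the grid points and one flat descriptor vector
    (steric values followed by electrostatic values).
    """
    coords = np.asarray(coords, dtype=float)
    charges = np.asarray(charges, dtype=float)
    # Regular lattice spanning the box [grid_min, grid_max] along each axis.
    axes = [np.arange(lo, hi + 1e-9, spacing) for lo, hi in zip(grid_min, grid_max)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)
    # Distance from every grid point to every atom, floored at r_min.
    dist = np.linalg.norm(grid[:, None, :] - coords[None, :, :], axis=-1)
    dist = np.maximum(dist, r_min)
    electrostatic = (charges / dist).sum(axis=1)        # Coulomb, arbitrary units
    steric = (dist ** -12 - dist ** -6).sum(axis=1)     # 12-6 form, arbitrary units
    return grid, np.concatenate([steric, electrostatic])

# Hypothetical two-atom dipole sampled on a small box.
grid, x = field_descriptors(
    coords=[[0.0, 0.0, 0.0], [1.2, 0.0, 0.0]],
    charges=[0.4, -0.4],
    grid_min=(-4.0, -4.0, -4.0),
    grid_max=(4.0, 4.0, 4.0),
)
print(grid.shape, x.shape)
```

In a regression setting, one such descriptor vector per aligned molecule would form a row of the design matrix. Because the lattice is fixed in space, the descriptors depend on how the molecules are aligned.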
Receptor-based design strategies generally involve an
explicit representation of shape derived from an atomic-
resolution structure of the active site. For example,
UCSF DOCK15,16 packs the active site with spheres,
producing an efficient representation of the volume
available to accommodate a ligand, and combines this
with positions of hydrogen bond acceptors and donors.
Docking algorithms such as FLOG,17 GOLD,18,19 and
FlexiDock20 use an all-atom representation of the active
* To whom correspondence should be addressed. Phone: 215-596-8691. Fax: 215-596-8543. E-mail: r.zauhar@usip.edu.
† University of the Sciences in Philadelphia.
‡ University of Medicine and Dentistry of New Jersey.
5674 J. Med. Chem. 2003, 46, 5674-5690
10.1021/jm030242k CCC: $25.00 © 2003 American Chemical Society
Published on Web 11/19/2003