Shape Signatures: A New Approach to Computer-Aided Ligand- and Receptor-Based Drug Design

Randy J. Zauhar,*,† Guillermo Moyna, LiFeng Tian, ZhiJian Li, and William J. Welsh

Department of Chemistry & Biochemistry, University of the Sciences in Philadelphia, 600 S. 43rd Street, Philadelphia, Pennsylvania 19104 and Department of Pharmacology, Robert Wood Johnson Medical School, University of Medicine and Dentistry of New Jersey, 675 Hoes Lane, Piscataway, New Jersey 08854

Received May 20, 2003

A unifying principle of rational drug design is the use of either shape similarity or complementarity to identify compounds expected to be active against a given target. Shape similarity is the underlying foundation of ligand-based methods, which seek compounds with structure similar to known actives, while shape complementarity is the basis of most receptor-based design, where the goal is to identify compounds complementary in shape to a given receptor. These approaches can be extended to include molecular descriptors in addition to shape, such as lipophilicity or electrostatic potential. Here we introduce a new technique, which we call shape signatures, for describing the shape of ligand molecules and of receptor sites. The method uses a technique akin to ray-tracing to explore the volume enclosed by a ligand molecule, or the volume exterior to the active site of a protein. Probability distributions are derived from the ray-trace, and can be based solely on the geometry of the reflecting ray, or may include joint dependence on properties, such as the molecular electrostatic potential, computed over the surface. Our shape signatures are just these probability distributions, stored as histograms. They converge rapidly with the length of the ray-trace, are independent of molecular orientation, and can be compared quickly using simple metrics.
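As an illustration of how histogram-based signatures might be compared, the following minimal sketch bins a set of ray-segment lengths into a normalized histogram and scores two histograms with an L1 (sum of absolute differences) distance. The function names, the bin settings, the segment lengths, and the choice of the L1 metric are all assumptions made for illustration; they are not taken from the method described here.

```python
from typing import List, Sequence


def shape_histogram(segment_lengths: Sequence[float],
                    bin_width: float,
                    n_bins: int) -> List[float]:
    """Bin ray-segment lengths into a normalized histogram (entries sum to 1).

    Hypothetical helper: the binning scheme is an assumption.
    """
    if not segment_lengths:
        raise ValueError("need at least one segment length")
    counts = [0] * n_bins
    for length in segment_lengths:
        # Lengths beyond the last bin edge are clamped into the final bin.
        index = min(int(length / bin_width), n_bins - 1)
        counts[index] += 1
    total = len(segment_lengths)
    return [count / total for count in counts]


def l1_distance(hist_a: Sequence[float], hist_b: Sequence[float]) -> float:
    """Sum of absolute bin-wise differences (0 means identical histograms)."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


# Made-up segment lengths standing in for the output of a ray-trace.
query = shape_histogram([1.2, 1.4, 2.6, 2.9, 3.1], bin_width=1.0, n_bins=4)
similar = shape_histogram([1.1, 1.6, 2.4, 2.7, 3.3], bin_width=1.0, n_bins=4)
different = shape_histogram([0.2, 0.4, 0.5, 0.7, 3.8], bin_width=1.0, n_bins=4)

print(l1_distance(query, similar))    # small distance: similar distributions
print(l1_distance(query, different))  # larger distance: dissimilar distributions
```

Because each histogram is normalized, the score does not depend on how many segments were collected, and a comparison costs only one pass over the bins.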
Shape signatures can be used to test both for shape similarity between compounds and for shape complementarity between compounds and receptors, and thus can be applied to problems in both ligand- and receptor-based molecular design. We present results for comparisons between small molecules of biological interest and the NCI Database using shape signatures under two different metrics. Our results show that the method can reliably extract compounds of shape (and polarity) similar to the query molecules. We also present initial results for a receptor-based strategy using shape signatures, with application to the design of new inhibitors predicted to be active against HIV protease.

Introduction

A universal problem in computer-aided drug design is the comparison of molecular shape [1-3]. In ligand-based design, the underlying assumption is that a biologically active compound is complementary in shape to some target receptor, suggesting that molecules similar in shape and electrostatic properties to a known active compound will themselves be complementary to the receptor and also active. In receptor-based design, the structure of the target binding site is already known in atomic detail, and the goal is to directly identify compounds that are complementary to the site both in shape and polarity.

A number of methods have been devised for screening compound libraries for molecules likely to be active against a selected target [4-12]. Most of these take molecular shape into account, either explicitly or implicitly. Perhaps the most popular ligand-based strategy that takes shape explicitly into account is CoMFA [13,14] (comparative molecular field analysis), wherein the van der Waals and electrostatic fields of molecules are sampled over a grid and used as descriptors in a regression model intended to predict biological activity. CoMFA thus includes both molecular shape and polarity.
The various methods for defining pharmacophore models represent ligand shape implicitly by incorporating some collection of hydrogen bond acceptors and donors and regions of steric bulk and imposing intergroup distance constraints; this 3D geometric information clearly depends on molecular shape. A number of approaches have been developed that compute topological descriptors of molecules, beginning with chemical structure or starting with the wave function; such descriptors derive directly from molecular shape. Even methods based on chemical fingerprints include implicit shape information, since only a restricted family of compounds will be compatible with the chemical and connectivity information contained in the fingerprint.

* To whom correspondence should be addressed. Phone: 215-596-8691, Fax: 215-596-8543, e-mail: r.zauhar@usip.edu.
† University of the Sciences in Philadelphia. ‡ University of Medicine and Dentistry of New Jersey.
J. Med. Chem. 2003, 46, 5674-5690. 10.1021/jm030242k. © 2003 American Chemical Society. Published on Web 11/19/2003.

Receptor-based design strategies generally involve an explicit representation of shape derived from an atomic-resolution structure of the active site. For example, UCSF DOCK [15,16] packs the active site with spheres, producing an efficient representation of the volume available to accommodate a ligand, and combines this with positions of hydrogen bond acceptors and donors. Docking algorithms such as FLOG [17], GOLD [18,19], and FlexiDock [20] use an all-atom representation of the active