Optimization of High Throughput Virtual Screening by Combining Shape-Matching and Docking Methods

Hui Sun Lee, Jiwon Choi, Irina Kufareva, Ruben Abagyan, Anton Filikov,§ Young Yang, and Sukjoon Yoon*,†

Department of Biological Sciences, Research Center for Women’s Diseases (RCWD), Sookmyung Women’s University, Hyochangwongil 52, Yongsan-gu, Seoul, Republic of Korea 140-742, Department of Molecular Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, and ArQule, Inc., 19 Presidential Way, Woburn, Massachusetts 01801

Received October 20, 2007

Receptor flexibility is a critical issue in structure-based virtual screening methods. Although multiple-receptor conformation docking is an efficient way to account for receptor flexibility, it is still too slow for large molecular libraries. It has been reported that fast ligand-centric, shape-based virtual screening is more consistent for hit enrichment than typical single-receptor conformation docking. We therefore designed a “distributed docking” method that improves virtual high throughput screening by combining a shape-matching method with multiple-receptor conformation docking. Database compounds are classified in advance according to their shape similarity to one of the crystal ligands complexed with the target protein. This classification enables us to pick the appropriate receptor conformation for single-receptor conformation docking of a given compound, thereby avoiding time-consuming multiple docking. In particular, this approach uses cross-docking scores of known ligands against all available receptor structures to optimize the algorithm. The present virtual screening method was tested for reidentification of known PPARγ and p38 MAP kinase active compounds. We demonstrate that this method improves enrichment while maintaining the computation speed of typical single-receptor conformation docking.
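The classify-then-dock idea summarized in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: `shape_similarity` and `dock` are hypothetical callables supplied by the user (no particular shape-matching or docking program is implied), each crystal ligand is simply paired with one receptor conformation through the hypothetical mapping `receptor_for_ligand`, and lower docking scores are assumed to be better. The cross-docking optimization mentioned above is not modeled here.

```python
from typing import Callable, Dict, List, Tuple


def assign_receptors(
    compounds: List[str],
    reference_ligands: List[str],
    shape_similarity: Callable[[str, str], float],
    receptor_for_ligand: Dict[str, str],
) -> Dict[str, str]:
    """Map each compound to the receptor conformation paired with its
    most shape-similar reference (crystal) ligand.

    All arguments are hypothetical placeholders for illustration only.
    """
    assignment: Dict[str, str] = {}
    for compound in compounds:
        # Pick the reference ligand with the highest shape similarity.
        best_ref = max(
            reference_ligands,
            key=lambda ref: shape_similarity(compound, ref),
        )
        assignment[compound] = receptor_for_ligand[best_ref]
    return assignment


def distributed_docking(
    compounds: List[str],
    reference_ligands: List[str],
    shape_similarity: Callable[[str, str], float],
    receptor_for_ligand: Dict[str, str],
    dock: Callable[[str, str], float],
) -> List[Tuple[str, str, float]]:
    """Dock every compound once, against its assigned receptor only,
    and return (compound, receptor, score) sorted best-first.

    Assumption: a lower docking score means a better predicted binder.
    """
    assignment = assign_receptors(
        compounds, reference_ligands, shape_similarity, receptor_for_ligand
    )
    results = [
        (compound, receptor, dock(compound, receptor))
        for compound, receptor in assignment.items()
    ]
    return sorted(results, key=lambda item: item[2])
```

In this sketch the number of docking runs equals the number of compounds, whereas docking every compound against every receptor conformation would require the number of compounds multiplied by the number of conformations. That difference is the source of the speed advantage the abstract describes.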
INTRODUCTION

In recent years, virtual high throughput screening (VHTS) has become an essential technique for the discovery of new lead compounds, and it has served as an alternative to experimental high throughput screening in drug discovery. The importance of VHTS in drug discovery is increasing along with the rapidly growing number of small molecules available in corporate and public libraries.1 A plethora of available target proteins with high-resolution crystal structures has also accelerated the development of structure-based VHTS methods. Despite recent theoretical and technical improvements in the field,2 the performance of VHTS methods is still sometimes unsatisfactory, in part due to the flexible nature of receptor conformation.3

VHTS methods can be classified into two categories: ligand-centric and receptor-centric virtual screening. Ligand-centric methods essentially focus on comparative analysis of the structural shape and chemical or pharmacophore similarity between compounds and known ligands. Therefore, knowledge of experimentally selected active compounds is a prerequisite for applying ligand-centric methods.4 Receptor-centric methods, on the other hand, predict the interaction of a given compound with a target receptor and do not necessarily require experimental data on active compounds. Molecular docking, a key method in receptor-centric virtual screening, is a computational technique that predicts the binding mode and affinity of a given compound for a target receptor.5 Docking is a central component in many lead discovery strategies.6

A critical issue in receptor-centric virtual screening is how to incorporate the dynamic nature of receptor structures. Commonly in molecular docking algorithms, the target protein is kept rigid in a single low-energy conformation, and only the conformational and positional flexibility of the ligand is considered. Proteins, however, can have different conformational states with similar energies.
In many cases, the binding site conformation of a receptor exhibits significant motion upon ligand binding, including rearrangements of side chains and backbone. This is called “induced fit”.7 Even small local motions of side chains may significantly impact docking results.8 Therefore, using a single receptor conformation in docking experiments can lead to errors in the identification of binding modes and in the prediction of binding affinities. This can significantly reduce the chances of finding new ligands.9 In such a flexible system, no clear relationship between docking and ranking was found.10

* Corresponding author phone: 82-2-710-9415; fax: 82-2-2077-7322; e-mail: yoonsj@sookmyung.ac.kr.
† Sookmyung Women’s University.
‡ The Scripps Research Institute.
§ ArQule, Inc.
J. Chem. Inf. Model. 2008, 48, 489-497. 10.1021/ci700376c CCC: $40.75 © 2008 American Chemical Society. Published on Web 02/27/2008.

There have been various attempts to include protein flexibility in the virtual molecular docking procedure. A simple approach is to reduce the van der Waals radii of the receptor and/or ligand atoms, or to delete some of the side chains, in order to eliminate possible close contacts caused by the rigidity of the receptor conformation.11 Another approach is to use an ensemble of experimental receptor conformations in ligand docking. Knegtel et al. used crystal and solution structures to generate combined interaction grids by averaging with