EADock: Docking of Small Molecules Into Protein Active Sites With a Multiobjective Evolutionary Optimization

Aurélien Grosdidier,† Vincent Zoete,† and Olivier Michielin*

Swiss Institute of Bioinformatics (SIB), Molecular Modeling Group, Quartier Sorges, Bâtiment Génopode, CH-1015 Lausanne, Switzerland

ABSTRACT

In recent years, protein–ligand docking has become a powerful tool for drug development. Although several approaches suitable for high-throughput screening are available, there is a need for methods able to identify binding modes with high accuracy. This accuracy is essential to reliably compute the binding free energy of the ligand. Such methods are required when the binding mode of a lead compound has not been determined experimentally but is needed for structure-based lead optimization. We present here a new docking program, called EADock, that aims at this goal. It uses a hybrid evolutionary algorithm with two fitness functions, combined with a sophisticated management of diversity. EADock is interfaced with the CHARMM package for energy calculations and coordinate handling. A validation was carried out on 37 crystallized protein–ligand complexes featuring 11 different proteins. The search space was defined as a sphere of 15 Å around the center of mass of the ligand position in the crystal structure and, in contrast to other benchmarks, our algorithm was fed with optimized ligand positions up to 10 Å root mean square deviation (RMSD) from the crystal structure, excluding the crystal structure itself. This validation illustrates the efficiency of our sampling strategy: correct binding modes, defined by an RMSD to the crystal structure lower than 2 Å, were identified and ranked first for 68% of the complexes. The success rate increases to 78% when considering the five best-ranked clusters, and to 92% when all clusters present in the last generation are taken into account.
Most failures could be explained by the presence of crystal contacts in the experimental structure. Finally, the ability of EADock to accurately predict binding modes in a real application was illustrated by the successful docking of the RGD cyclic pentapeptide on the αVβ3 integrin, starting far away from the binding pocket. Proteins 2007;67:1010–1025. © 2007 Wiley-Liss, Inc.

Key words: small ligand docking; evolutionary algorithms; rational drug design; EADock

INTRODUCTION

Structures and Drug Design

The number of experimentally resolved protein structures is growing exponentially thanks to huge efforts and improvements in crystallographic techniques. Several of these protein structures are potential targets for the pharmaceutical industry, and the importance of structure-based drug design has thus increased during the past few years. Several computational approaches based on these structures aim at rationalizing experiments by focusing on compounds more likely to have the desired activity, bioavailability, and toxicity. Presently, the most common technique to tackle this problem, called virtual high-throughput screening (VHTS), aims to rank several thousand small molecules (usually taken from a database) according to a few properties related to their binding to a pharmaceutically relevant target. Complementary to VHTS, in silico rational drug design (RDD) suggests structure-based modifications of a lead compound.

The Docking Problem

Both VHTS and RDD rely heavily on the structural prediction of a complex between a ligand and a targeted receptor. Once the energetically most favorable binding mode of each ligand is identified, the ligands can be ranked according to their estimated affinity and/or activity, in order to guide experiments. This step is extremely sensitive to the accuracy of the predicted binding mode.
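The accuracy of a predicted binding mode is quantified in the abstract by its RMSD to the crystal structure, with 2 Å as the threshold for a correct pose. The following is a minimal sketch of that criterion, assuming both poses list the same atoms in the same order and share the receptor frame, so that no superposition is applied. The function names `rmsd` and `is_correct_binding_mode` are hypothetical and are not part of EADock or CHARMM.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two poses given as lists of (x, y, z) tuples in angstroms.

    Assumption: identical atom ordering and a common receptor frame,
    so the poses are compared as-is, without any fitting.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


def is_correct_binding_mode(pose, crystal, threshold=2.0):
    """Apply the success criterion from the abstract: RMSD lower than 2 angstroms."""
    return rmsd(pose, crystal) < threshold


# Toy three-atom ligand (illustrative coordinates, not taken from the paper).
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
near_pose = [(x + 1.0, y, z) for x, y, z in crystal]  # rigid shift of 1 angstrom
far_pose = [(x + 3.0, y, z) for x, y, z in crystal]   # rigid shift of 3 angstroms

print(is_correct_binding_mode(near_pose, crystal))  # True
print(is_correct_binding_mode(far_pose, crystal))   # False
```

A rigid translation by d angstroms gives an RMSD of exactly d, which makes the two toy poses easy to check by hand.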
This article focuses on the first part of the question, known as the docking problem, which may be considered as the optimization of structural and energetic criteria described by a scoring function, given a set of degrees of freedom corresponding to the ligand and receptor conformations and their relative positions.

†A. Grosdidier and V. Zoete contributed equally to this work.
*Correspondence to: Olivier Michielin, Swiss Institute of Bioinformatics (SIB), Quartier Sorges, Bâtiment Génopode, CH-1015 Lausanne, Switzerland. E-mail: olivier.michielin@isrec.unil.ch
Grant sponsor: Swiss National Science Foundation; Grant numbers: 3232B0-103172, 3200B0-103173; Grant sponsor: Oncosuisse; Grant number: OCS 01381-08-2003; Grant sponsors: National Center of Competence in Research (NCCR); Swiss Institute of Bioinformatics.
Received 22 September 2006; Revised 22 November 2006; Accepted 12 December 2006
Published online 22 March 2007 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/prot.21367
PROTEINS: Structure, Function, and Bioinformatics 67:1010–1025 (2007). © 2007 Wiley-Liss, Inc.

Existing Approaches

An exhaustive exploration of the search space is not feasible because of its size, which grows exponentially with