Review

Receptor-based virtual screening protocol for drug discovery

Nuno M.F.S.A. Cerqueira, Diana Gesto, Eduardo F. Oliveira, Diogo Santos-Martins, Natércia F. Brás, Sérgio F. Sousa, Pedro A. Fernandes, Maria J. Ramos

UCIBIO, REQUIMTE, Departamento de Química e Bioquímica, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal

Article info

Article history:
Received 2 March 2015
and in revised form 26 May 2015
Available online xxxx

Keywords:
Virtual screening
Molecular docking
Scoring functions
Search algorithms
Drug discovery

Abstract

Computer-aided drug design (CADD) is presently a key component in the process of drug discovery and development, as it offers great promise to drastically reduce cost and time requirements. In the pharmaceutical arena, virtual screening is normally regarded as the top CADD tool to screen large libraries of chemical structures and reduce them to a key set of likely drug candidates for a specific protein target. This review provides a comprehensive overview of the receptor-based virtual screening process and of its importance in the present drug discovery and development paradigm. Following a focused contextualization of the subject, particular attention is given to the main stages of a virtual screening campaign, including their strengths and limitations. In all of these stages, special consideration is given to practical issues that are normally the Achilles heel of the virtual screening process.

© 2015 Published by Elsevier Inc.

Introduction

The process of drug discovery is very complex and requires an interdisciplinary effort to design effective and commercially feasible drugs. The objective of drug design is to find a drug that can interact with a specific drug target and modify its activity.
Drug targets are generally proteins, which perform most of the tasks needed to keep cells alive. Drugs are small molecules that bind to a specific region of a protein and can turn it on or off. Some very powerful drugs, such as antibiotics or anticancer drugs, are used to completely disable a critical protein in the cell. These drugs can kill bacteria or cancer cells.

It is generally recognized that drug discovery and development are very time- and resource-consuming processes, and the whole process is often compared to searching for a needle in a haystack. It is estimated that a typical drug discovery cycle, from lead identification to clinical trials, can take 17 years at a cost of 800 million US dollars. In this process it is estimated that five out of 40,000 compounds tested in animals eventually reach human testing, and only one in five compounds that enter clinical studies is approved. This represents an enormous investment in terms of time, money and human resources. It includes chemical synthesis, purchase, and biological screening of hundreds of thousands of compounds to identify hits, followed by their optimization to generate leads, which requires further synthesis. In addition, the predictability of animal studies in terms of both efficacy and toxicity is frequently suboptimal. Therefore, new approaches are needed to facilitate, expedite and streamline drug discovery and development, and to save time, money and resources.

On October 5, 1981, Fortune magazine published a cover article entitled “Next Industrial Revolution: Designing Drugs by Computer at Merck”. Some have credited this as being the start of intense interest in computer-aided drug design (CADD)¹ [1].

CADD is defined by the IUPAC as all computer-assisted techniques used to discover, design and optimize compounds with desired structure and properties.
CADD has emerged from recent advances in computational chemistry and computer technology, and promises to revolutionize the design of functional molecules. The ultimate goal of CADD is to virtually screen a large database of compounds to generate a set of hit compounds (active drug candidates) or lead compounds (most likely candidates for further evaluation), or to optimize known lead compounds, i.e. transform biologically active compounds into suitable drugs by improving their physicochemical, pharmaceutical and ADMET/PK (pharmacokinetic) properties [2].

Corresponding author. E-mail address: mjramos@fc.up.pt (M.J. Ramos).

¹ Abbreviations used: CADD, computer-aided drug design; VS, virtual screening; MC, Monte Carlo; GA, Genetic Algorithms; RMSD, Root Mean Square Deviation; MD, Molecular Dynamics; FEP, Free Energy Perturbation; TI, Thermodynamic Integration; PLP, Piecewise Linear Potential; PMF, Potential of Mean Force; EF, enrichment factor; ROC, Receiver Operator Characteristic; TPR, true positive rate; FPR, false positive rate.

Archives of Biochemistry and Biophysics xxx (2015) xxx–xxx
http://dx.doi.org/10.1016/j.abb.2015.05.011
0003-9861/© 2015 Published by Elsevier Inc.