Exact calculation of peptide-protein binding energies by steered thermodynamic integration using high performance computing grids P.W. Fowler, 1, * P.V. Coveney, 1, † S. Jha, 1 and S. Wan 1, 2 1 Centre for Computational Science, Department of Chemistry, University College London, Christopher Ingold Laboratories, 20 Gordon Street, London WC1H 0AJ, United Kingdom 2 Edward Jenner Institute for Vaccine Research, Compton, Berkshire RG20 7NN, United Kingdom We describe a grid-based method to dramatically accelerate from weeks to under 48 hours the calculation of differences in binding free energy and its application to Src homology 2 (SH2) protein cell signalling domains. The method of calculation, thermodynamic integration, is brieﬂy outlined and we indicate how the calculation process works from the perspective of an application scientist using either the UK National Grid Service or the US Teragrid. The development of a PDA-based steering client is especially useful as it gives the application scientist more freedom. Finally, we discuss our experience in developing and deploying the application on a grid. I. INTRODUCTION The ability to rapidly and accurately calculate a dif- ference in binding free energies between two drug candi- dates and their target (usually a protein) is of interest in structural biology and of importance to the pharmaceu- tical industry. Such calculations are difﬁcult to perform, require a substantial amount of computational resource and consequently take weeks to months to complete. It is our aim to dramatically speed-up an existing technique to allow a user to easily complete a binding afﬁnity calcula- tion within 48 hours. The RealityGrid [1] steering library provides a generic computational steering interface [2]. Since April 2004, it has been available to download under a liberal open source license [3]. We have integrated this steering li- brary with NAMD2 and VMD, extending their existing steering capabilities [4]. NAMD2 is a highly-scalable classical molecular dynamics application used primarily for biomolecular simulations [5], and VMD is its sister visualisation package [6]. II. IN-SILICO DRUG DESIGN? There has been a signiﬁcant research effort, both com- mercial and academic, expended in the last ten years to understand the precise nature and mechanism by which small peptides bind to SH2 protein signalling domains (see Figure 1). The ultimate aim of this effort is to de- velop drug leads that inhibit speciﬁc SH2 protein do- mains. These systems are well-studied but remain poorly understood. Different SH2 protein signalling domains are found within many cell signalling pathways. Inhibition of spe- * Electronic address: philip.fowler@ucl.ac.uk † Electronic address: p.v.coveney@ucl.ac.uk ciﬁc SH2 domains is expected to lead to control of spe- ciﬁc pathways through the activation or deactivation of the genes that they regulate. Precisely because these path- ways are so generic, the possibilites to inﬂuence a wide range of ailments, from osteoporosis to immunological disorders, are large [7]. When developing a drug it is vital to know the Gibbs free energy of binding (ΔG) [8]. Relating the strength of binding with structural information, such as how the candidate drug interacts with the SH2 domain, is an es- tablished and important method for gaining insight and thereby developing good drug leads. It is possible to measure experimentally both ΔG and its components, the enthalpy ΔH and the entropy ΔS, using isothermal calorimetry [8]. There also exist computational meth- ods for computing the difference in binding free energy, ΔΔG, between two peptides. We will discuss in this pa- per one such method, that of thermodynamic integration. Thermodynamic integration requires the use of a ther- modynamic cycle as shown in Figure 2. The difference in the free energy of binding between drug A and drug B is given by: ΔΔG AB =ΔG A - ΔG B (1) It is computationally impractical to assess either ΔG A or ΔG B directly. Instead we note that G is a thermody- namic state function and therefore sums to zero around a cycle. This allows us to consider instead ΔΔG AB =ΔG 1 - ΔG 2 (2) Next we assume that G is a continuous function of a pa- rameter, λ, ΔG = 1 Z 0 ∂G(λ) ∂λ dλ (3)