SIRAH: A Structurally Unbiased Coarse-Grained Force Field for Proteins with Aqueous Solvation and Long-Range Electrostatics
Leonardo Darré, Matías Rodrigo Machado, Astrid Febe Brandner, Humberto Carlos González, Sebastián Ferreira, and Sergio Pantano*
Institut Pasteur de Montevideo, Montevideo, Uruguay
Department of Chemistry, Kings College, London, United Kingdom
ABSTRACT: Modeling of macromolecular structures and interactions represents an important challenge for computational biology, involving different time and length scales. However, this task can be facilitated through the use of coarse-grained (CG) models, which reduce the number of degrees of freedom and allow efficient exploration of complex conformational spaces. This article presents a new CG protein model named SIRAH, developed to work with explicit solvent and to capture sequence, temperature, and ionic strength effects in a topologically unbiased manner. SIRAH is implemented in GROMACS, and interactions are calculated using a standard pairwise Hamiltonian for classical molecular dynamics simulations. We present a set of simulations that test the capability of SIRAH to produce a qualitatively correct solvation on different amino acids, hydrophilic/hydrophobic interactions, and long-range electrostatic recognition leading to spontaneous association of unstructured peptides and stable structures of single polypeptides and protein-protein complexes.
INTRODUCTION
The exponential growth of computer power added to the development of faster algorithms has contributed to make molecular simulations a reliable tool for the study of biomolecular systems. Nevertheless, direct comparison with experimental data is often difficult owing to the large size and long time scales needed for a proper description of the complex biological environment. These difficulties have motivated the development of simplified methods aimed to bridge the gap between experiments and simulations.
A large number of coarse-grained (CG) molecular representations have been described in the literature for the simulation of the most common biological species [1-18]. In general, the microscopic details are coarsened following either top-down or bottom-up approaches. In bottom-up schemes, a given Hamiltonian function is chosen and parametrized to fit fine-grained (FG) simulations taken as a reference. Several strategies to derive CG potentials have been developed on the basis of mining degrees of freedom from FG simulations through force matching techniques, Boltzmann inversion, thermodynamic integration, etc. [19,20]. In top-down approaches, force fields are often tailored on the basis of physicochemical intuition and/or trial-and-error simulations, and interaction parameters are fitted to match available experimental data. Bottom-up strategies can produce very accurate potentials and are well suited to the description of uniform systems. However, it may be difficult to derive a general and transferable CG force field for highly heterogeneous macromolecules such as proteins [20]. On the other hand, the accuracy of top-down models may depend strongly on the availability of experimental data, but these models may provide potentials that are more easily transferable [21]. For recent reviews on different CG approaches, see Ingolfsson et al. [22] and Brini et al. [23].
Recently, our group has undertaken the initiative to develop a CG force field for biomolecules named SIRAH (http://www.sirahff.com). We followed a top-down approach, fitting structural properties of macromolecules using a standard pairwise Hamiltonian common to most MD simulation packages. So far, the SIRAH force field includes parameters and topologies for simulating DNA using an implicit solvation scheme [24,25] or embedded in an explicit CG representation of aqueous solvation [26]. Our CG model for water (named WatFour, or WT4 for short) is composed of four linked beads, each carrying a partial charge.
This gives WT4 the capacity to generate its own dielectric permittivity, while the use of CG electrolytes helps to account for ionic strength effects and osmotic pressure [26]. The WT4 model has recently been shown to be suitable for hybrid or dual-resolution simulations, where regions of interest within molecular systems can be treated at full atomic detail, while bulk regions of the solvent are simulated at the CG level without perturbing the structure and dynamics of the atomistic part [27-29]. Along this line, we have also expanded our force field to consider a dual-resolution version of double-stranded DNA [30] compatible with the FG AMBER99 force field [31].
Received: August 26, 2014. Published: December 17, 2014. Article, pubs.acs.org/JCTC. © 2014 American Chemical Society. DOI: 10.1021/ct5007746. J. Chem. Theory Comput. 2015, 11, 723-739
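For orientation, the "standard pairwise Hamiltonian" referred to in the text usually denotes the generic potential-energy form shared by classical MD packages. The expression below is a sketch of that generic form only. It is not the specific SIRAH functional form or parameter set, which this section does not give. The symbols ($k_b$, $r_0$, $k_\theta$, $\theta_0$, $k_\phi$, $n$, $\delta$, $\epsilon_{ij}$, $\sigma_{ij}$, $q_i$) are conventional notation assumed here, and prefactor conventions, such as the factor 1/2 in the harmonic terms, differ between packages.

```latex
\begin{equation}
U(\mathbf{r}) =
  \sum_{\text{bonds}} \tfrac{1}{2} k_b \,(r - r_0)^2
+ \sum_{\text{angles}} \tfrac{1}{2} k_\theta \,(\theta - \theta_0)^2
+ \sum_{\text{dihedrals}} k_\phi \,\bigl[ 1 + \cos(n\phi - \delta) \bigr]
+ \sum_{i<j} \left\{ 4\epsilon_{ij}
    \left[ \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{12}
         - \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{6} \right]
  + \frac{q_i q_j}{4\pi\varepsilon_0 \, r_{ij}} \right\}
\end{equation}
```

In a potential of this generic type, the partial charges carried by explicit solvent beads enter through the Coulomb term, which is the route by which a charged-bead water model can screen electrostatic interactions.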