Are Current Semiempirical Methods Better Than Force Fields? A Study from the Thermodynamics Perspective

Gustavo de M. Seabra, Ross C. Walker,§ and Adrian E. Roitberg*,‡

Quantum Theory Project and Department of Chemistry, University of Florida, 2234 New Physics Building #92, P.O. Box 118435, Gainesville, Florida 32611-8435, and San Diego Supercomputer Center, University of California, San Diego, 9500 Gilman Drive #0505, La Jolla, California 92093-0505

Received: April 15, 2009; Revised Manuscript Received: June 16, 2009

The semiempirical Hamiltonians MNDO, AM1, PM3, RM1, PDDG/MNDO, PDDG/PM3, and SCC-DFTB, when used as part of a hybrid QM/MM scheme for the simulation of biological molecules, were compared for their ability to reproduce experimental ensemble averages at or near room temperature for the model system alanine dipeptide in water. Free energy surfaces in the (φ, ψ) dihedral angle space, ³J(HN,Hα) NMR dipolar coupling constants, basin populations, and peptide-water radial distribution functions (RDF) were calculated from replica exchange simulations and compared to both experiment and fully classical force field calculations using the Amber ff99SB force field. In contrast with the computational chemist's intuitive idea that the more expensive a method, the better its accuracy, the ff99SB force field results were more accurate than those of most of the semiempirical methods, with the exception of RM1. None of the methods, however, was able to accurately reproduce the experimental data. Analysis of the results indicates that the specific QM/MM interactions have little influence on the sampling of free energy surfaces, and the differences are well explained simply by the intrinsic properties of the various QM methods.

Introduction

Semiempirical (SE) methods for solving the Schrödinger equation have been extensively tested, compared, and adjusted for more than 20 years now.
1-35 Those tests, however, generally focus on a method's ability to reproduce static data such as optimized geometries, heats of formation, reaction energies, and spectroscopic parameters, usually at zero temperature. Recent advances in computer processor technology, parallel programming, and the availability of supercomputer clusters have allowed computational chemists to apply a broader range of methods to systems of ever increasing size and complexity, pushing the semiempirical methods beyond the limits for which they were designed. The compromise between accuracy and speed provided by SE methods now allows for significant sampling and treatment of much larger systems without completely forfeiting quantum mechanical effects, opening the possibility of applying them to computational studies of biological molecules in their native environment. For example, SE methods have already been applied in studies ranging from enzyme reactions 36-41 to solution structures of peptides 42-44 and even to structural studies of whole proteins. 45 Their native implementations in popular biomolecular simulation programs such as AMBER 46-48 and CHARMM 49,50 promise to make the use of hybrid quantum mechanics/molecular mechanics (QM/MM) methods even more widespread.

For conformational sampling, one can imagine a hierarchy of methods ordered by computational cost, where cost is generally believed to be directly proportional to a method's accuracy. At one end would be the faster force field methods, followed by the polarizable force fields, then semiempirical methods, and finally the much more costly ab initio and density functional methods. As computer capabilities increase and the use of semiempirical methods for larger systems becomes more accessible, it will be tempting, at some point, to discard empirical force fields altogether.
Under such conditions, it is important to ask whether those SE methods really are the most appropriate for the problems under consideration, including very large systems not included in their parametrization sets. There is no doubt that QM methods are required for situations where intrinsically quantum processes such as bond breaking and forming, tunneling, or charge redistribution are important. 51-56 However, the SE QM methods currently available have been parametrized against small molecules and reactions, usually to reproduce gas-phase data. 2,20,27,28,32,34,57,58 The parameters thus obtained are not guaranteed to be fully transferable to biological molecules in their natural surroundings.

The present work compares the performance of a series of commonly used SE Hamiltonians when used as part of a hybrid QM/MM scheme for the simulation of biological molecules, from a thermodynamics point of view: we focus on their ability to reproduce ensemble properties at or near room temperature, and under conditions that approach the real biological environment of such molecules.

Part of the “Walter Thiel Festschrift”.
‡ University of Florida.
§ University of California.
* roitberg@ufl.edu.

Figure 1. Scheme of the capped L-alanine dipeptide depicting the dihedral angles φ and ψ.

J. Phys. Chem. A 2009, 113, 11938–11948. doi: 10.1021/jp903474v. © 2009 American Chemical Society. Published on Web 07/15/2009.

We present results for a model system composed of the alanine dipeptide (Ace-Ala-NMe, Figure 1), an alanine unit blocked by an acetyl group at the N-terminus