An Improved Algorithm for Analytical Gradient Evaluation in Resolution-of-the-Identity Second-Order Møller-Plesset Perturbation Theory: Application to Alanine Tetrapeptide Conformational Analysis

ROBERT A. DISTASIO, JR., RYAN P. STEELE, YOUNG MIN RHEE, YIHAN SHAO, MARTIN HEAD-GORDON
Department of Chemistry, University of California, Berkeley, California 94720

Received 30 March 2006; Revised 29 June 2006; Accepted 20 July 2006
DOI 10.1002/jcc.20604
Published online 11 January 2007 in Wiley InterScience (www.interscience.wiley.com).

Abstract: We present a new algorithm for analytical gradient evaluation in resolution-of-the-identity second-order Møller-Plesset perturbation theory (RI-MP2) and thoroughly assess its computational performance and chemical accuracy. This algorithm addresses the potential I/O bottlenecks associated with disk-based storage and access of the RI-MP2 t-amplitudes by utilizing a semi-direct batching approach and yields computational speed-ups of approximately 2–3 over the best conventional MP2 analytical gradient algorithms. In addition, we attempt to provide a straightforward guide to performing reliable and cost-efficient geometry optimizations at the RI-MP2 level of theory. By computing relative atomization energies for the G3/99 set and optimizing a test set of 136 equilibrium molecular structures, we demonstrate that satisfactory relative accuracy and significant computational savings can be obtained using Pople-style atomic orbital basis sets with the existing auxiliary basis expansions for RI-MP2 computations. We also show that RI-MP2 geometry optimizations reproduce molecular equilibrium structures with no significant deviations (>0.1 pm) from the predictions of conventional MP2 theory.
As a chemical application, we computed the extended-globular conformational energy gap in alanine tetrapeptide at the extrapolated RI-MP2/cc-pV(TQ)Z level as 2.884, 4.414, and 4.994 kcal/mol for structures optimized using the HF, DFT (B3LYP), and RI-MP2 methodologies and the cc-pVTZ basis set, respectively. These marked energetic discrepancies originate from differential intramolecular hydrogen bonding present in the globular conformation optimized at these levels of theory and clearly demonstrate the importance of long-range correlation effects in polypeptide conformational analysis.

© 2007 Wiley Periodicals, Inc. J Comput Chem 28: 839–856, 2007

Key words: second-order Møller-Plesset theory; RI-MP2; MP2 analytical gradient; resolution-of-the-identity approximation; alanine; Pople-style basis sets; auxiliary basis sets; density-fitting; equilibrium geometries; force field parameters

Introduction

The ability to accurately predict molecular equilibrium structures has been one of the primary driving forces for the use of computational chemistry. In fact, theoretical predictions of molecular geometries have been deemed so reliable that many researchers consider them a viable alternative to experimental structure determination; at this point, the usefulness of theory is no longer restricted to cases where structural information might be extremely difficult, or even impossible, to obtain using current state-of-the-art experimental methods.¹⁻⁴ For many experimentalists, performing geometry optimizations using standard electronic structure methods has become a routine practice during the analysis of experimental findings, and more often than not, even calculations performed on a personal desktop computer can provide valuable insight and even influence future experimental directions.
*This article contains supplementary material available via the Internet at http://www.interscience.wiley.com/jpages/0192-8651/suppmat
Contract/grant sponsors: NIH SBIR; National Science Foundation
Correspondence to: M. Head-Gordon; e-mail: mhg@bastille.cchem.berkeley.edu

The recent works of Helgaker and coworkers⁵,⁶ have assessed the performance of the standard hierarchy of ab initio models in the structural optimization of molecular systems containing first and second row atoms. This hierarchy of electronic structure methods starts with the mean-field Hartree-Fock (HF) approximation,⁷,⁸ continues with the simplest correlation treatment, second-order Møller-Plesset perturbation theory (MP2),⁹,¹⁰ and proceeds to the higher-level coupled-cluster theories that include single and double excitations (CCSD)¹¹ as well as perturbative triples (CCSD(T)).¹²,¹³ Although these methods are readily available in most computational software packages, their practical use is often heavily restricted by the size of the molecular system of interest. In fact, geometry optimizations of molecular systems comprised of only