Assessing Equilibration and Convergence in Biomolecular Simulations

Lorna J. Smith,¹* Xavier Daura,² and Wilfred F. van Gunsteren²

¹Oxford Centre for Molecular Sciences, Central Chemistry Laboratory, University of Oxford, Oxford, United Kingdom
²Laboratory of Physical Chemistry, Swiss Federal Institute of Technology Zürich, ETH-Hönggerberg, Zürich, Switzerland

ABSTRACT If molecular dynamics simulations are used to characterize the folding of peptides or proteins, a wide range of conformational states needs to be sampled. This study reports an analysis of peptide simulations to identify the best methods for assessing equilibration and sampling in these systems, where there is significant conformational disorder. Four trajectories of a β-peptide in methanol and four trajectories of an α-peptide in water, each of 5 ns in length, have been studied. Comparisons have also been made with two 50-ns trajectories of the β-peptide in methanol. The convergence rates of quantities that probe both the extent of conformational sampling and the local dynamical properties have been characterized. These include the numbers of hydrogen bonds populated, clusters identified, and main chain torsion angle transitions in the trajectories. The relative equilibrium rates of different quantities are found to vary significantly between the two systems studied, reflecting both the differences in peptide primary structure and the different solvents used. A cluster analysis of the simulation trajectories is identified as a very effective method for judging the convergence of the simulations. This is particularly the case if the analysis includes a comparison of multiple trajectories calculated for the same system from different starting structures. Proteins 2002;48:487–496. © 2002 Wiley-Liss, Inc.
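The cluster-based convergence check described in the abstract can be illustrated with a minimal sketch. It assumes that each saved frame has already been assigned a cluster label by some clustering procedure; the labels below are hypothetical and are not data from this study, and the function names are illustrative only. The sketch reports how many distinct clusters have been visited as a function of frame index, together with the fraction of clusters populated in both of two trajectories.

```python
from __future__ import annotations

from typing import Hashable, Sequence


def cumulative_cluster_counts(labels: Sequence[Hashable]) -> list[int]:
    """Number of distinct clusters visited up to and including each frame."""
    seen: set[Hashable] = set()
    counts: list[int] = []
    for label in labels:
        seen.add(label)
        counts.append(len(seen))
    return counts


def shared_cluster_fraction(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Fraction of all clusters (the union) that are populated in both trajectories."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # no frames at all: define the overlap as zero
    return len(set_a & set_b) / len(union)


# Hypothetical cluster labels, one per saved frame, for two trajectories
# started from different structures (assumed input, not results from the paper).
traj_1 = [0, 0, 1, 0, 2, 1, 2, 2, 0, 1]
traj_2 = [3, 1, 1, 0, 0, 1, 3, 0, 1, 0]

curve_1 = cumulative_cluster_counts(traj_1)
curve_2 = cumulative_cluster_counts(traj_2)
overlap = shared_cluster_fraction(traj_1, traj_2)

print(curve_1)
print(curve_2)
print(overlap)
```

A curve that levels off suggests that few new conformations are still being found, and a shared fraction close to 1 for trajectories begun from different starting structures is consistent with both having sampled the same ensemble.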
Key words: molecular dynamics; peptide; unfolded conformations; GROMOS; protein folding

INTRODUCTION

Rapid developments in computer power are giving increased possibilities for using molecular dynamics (MD) simulation techniques to provide important insights into protein folding. Recently, MD simulations have been reported that characterize at an atomic level the folding of peptides¹⁻⁵ and a small protein⁶ in explicit solvent. MD simulation techniques are also being used to generate models for folding free energy landscapes and for denatured and partially folded states of proteins.⁷⁻¹⁰ In all these simulations, compared with those of proteins in their native state, much wider conformational ensembles need to be explored if meaningful results are to be obtained. Therefore, reliable assessments of the equilibration and the extent of sampling within simulations of these conformationally disordered states are required. This is the issue we address here, in a study that considers the most effective ways in which the quality of such MD simulations can be judged.

Two peptide systems are analyzed in this work. One of these is a 7-residue β-peptide in methanol and the other an 11-residue α-peptide in water. The β-peptide [Fig. 1(A)] is a non-natural peptide that forms a stable left-handed 3₁₄ helix in methanol.¹¹ MD simulations of this system show a temperature-dependent equilibrium between the folded 3₁₄ helix and unfolded conformations.¹² The α-peptide sequence [Fig. 1(B)] corresponds to residues 105–115 of the protein hen lysozyme (with Cys 115 changed to serine). These residues form an α helix (helix D) in the native protein.¹³ Experimental studies of the isolated peptide, however, show that it is unstructured in aqueous solution.¹⁴ The β- and α-peptides studied here have a very similar number of rotatable (i.e., nonpeptidic) main chain torsion angles (21 for the β-peptide and 22 for the α-peptide).
However, the characteristics of their sequences differ considerably, all the side chains in the β-peptide being aliphatic, whereas the α-peptide contains amino acids with hydrophobic, polar, and charged side chains. The contrasting primary structures and the different solvents used in the simulations result in two systems that differ in flexibility and dynamics.

Abbreviations: MD, molecular dynamics; RMSD, root-mean-square deviation; RMSF, root-mean-square fluctuation.
Grant sponsor: Schweizerischer Nationalfonds; Grant number: 21-57069.99
*Correspondence to: L.J. Smith, Central Chemistry Laboratory, University of Oxford, South Parks Road, Oxford, OX1 3QH, UK. E-mail: lorna.smith@chem.ox.ac.uk
Received 23 August 2001; Accepted 21 February 2002
Published online 00 Month 2002 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/prot.10144
PROTEINS: Structure, Function, and Genetics 48:487–496 (2002). © 2002 Wiley-Liss, Inc.

Two 50-ns trajectories and four 5-ns trajectories of the β-peptide in methanol are studied in this work. One of the 50-ns simulations was started from the folded 3₁₄ helical conformation and was run at 340 K (340 1), whereas for the other the initial structure was an extended conformation. A temperature of 360 K was used for this second simulation (360 1). Two 5-ns trajectories were taken at different time points from each of the longer 50-ns simulations for analysis (340 2, 340 3, 360 2, 360 3). Each of the 5-ns trajectories of the β-peptide therefore has a different