From laws of inference to protein folding dynamics

Chih-Yuan Tseng*
Graduate Institute of Systems Biology and Bioinformatics, National Central University, 320 Chungli, Taiwan

Chun-Ping Yu
Department of Physics, National Central University, 320 Chungli, Taiwan

H. C. Lee
Graduate Institute of Systems Biology and Bioinformatics and Department of Physics, National Central University, 320 Chungli, Taiwan

(Received 4 November 2009; revised manuscript received 12 May 2010; published 18 August 2010)

Protein folding dynamics is one of the major issues constantly investigated in the study of protein functions. Molecular dynamics (MD) simulation with the replica exchange method (REM) is a common theoretical approach. Yet a trade-off in applying the REM is that the dynamics toward the native configuration seems to be lost in the simulations. In this work, we show that, given REM-MD simulation results, protein folding dynamics can be derived directly from laws of inference. The applicability of the resulting approach, the entropic folding dynamics, is illustrated by investigating the well-studied Trp-cage peptide. Our results are qualitatively comparable with those from other studies. The present study suggests that combining laws of inference with physics offers a comprehensive perspective on protein folding dynamics.

DOI: 10.1103/PhysRevE.82.021914
PACS numbers: 87.15.Cc, 87.15.hm, 87.10.Ca, 87.10.Tf

I. INTRODUCTION

Protein folding dynamics is one of the major issues constantly investigated in the study of protein functions. Because the protein folding process involves complicated many-body interactions, MD simulation is a common theoretical approach. However, one issue hinders the practical use of MD simulation in studying protein folding processes.
As recognized in energy landscape theory, protein folding is a series of processes that starts with many possible states and proceeds across a rough potential energy surface created by many-body interactions [1]. It ends with a few possible states associated with native structures. During simulations, however, a protein may become trapped in one of the local energy minima on this surface. To resolve this issue, the replica exchange method (REM) has been proposed [2]. However, the Monte Carlo aspect that REM introduces seems to lose the dynamical information of the folding process.

Recent studies by Juraszek and Bolhuis suggest that the dynamics is not lost but is merely hidden beneath the sampling space [3]. To reveal the dynamics, they propose integrating appropriate sampling techniques, such as transition path sampling (TPS) [4–6], into MD simulation. By studying Trp-cage peptide folding dynamics, they found two folding trajectories in their simulations, which were consistent with the experimental results [3].

In this work, we tackle the folding dynamics problem differently by asking: "Can we reveal folding dynamics directly from pure REM-MD simulation results? And if so, how?" Because protein folding is primarily associated with slow processes, such as backbone movement, rather than with fast atomic motions, the approach hinges on developing a dynamical law that specifically takes into account the information relevant to slow folding processes. The common procedure for developing such physical laws normally starts with establishing a mathematical formalism, to which one then tries to append an interpretation. With this procedure, it is difficult to develop a many-body dynamical law that takes only specific information, such as many-body interactions, into account.
However, a reverse procedure provides a solution: one constructs a physical theory by first deciding what the subject is and what one wants to accomplish, and then designing an appropriate mathematical formalism. Because our goal is to study the dynamics of many-body systems by processing the corresponding dynamical information directly, the appropriate formalisms are found to be the laws of inference, namely consistency, objectivity, universality, and honesty. They are sufficiently constraining that they lead to a unique set of rules for processing information: the rules of probability theory and the method of maximum entropy (ME) [7,8].

Furthermore, Caticha argues that information geometry is a convenient tool with which to proceed. An information manifold is constructed from the independent parameters that characterize the system. The probability distributions of the system at specific states are treated as points in the manifold, so the evolution of the probability distributions is simply represented by a point "moving" in the manifold. Caticha shows that the dynamics of a physical system can be derived directly from laws of inference [7,9,10]. He therefore termed this approach entropic dynamics.

* Corresponding author; Department of Oncology, University of Alberta, Edmonton, AB T6G 1Z2, Canada; FAX: 1-780-6434380; chih-yuan.tseng@ualberta.ca
Present address: Graduate Institute of Systems Biology and Bioinformatics, National Central University, Chungli, Taiwan 320
PHYSICAL REVIEW E 82, 021914 (2010) ©2010 The American Physical Society

It should be noted that information geometry was originally proposed as a method of applying differential geometry to the study of statistical estimation (please refer to [11] for details). It has been successfully applied to