Nontargeted Parallel Cascade Selection Molecular Dynamics Using Time-Localized Prediction of Conformational Transitions in Protein Dynamics

Ryuhei Harada,*,† Vladimir Sladek,*,§,‡ and Yasuteru Shigeta*,†

† Center for Computational Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
§ Institute of Chemistry - Centre for Glycomics, Dubravska cesta 9, 84538 Bratislava, Slovakia
‡ Agency for Medical Research and Development (AMED), Chiyoda-ku, Tokyo 100-0004, Japan

Supporting Information

ABSTRACT: Nontargeted parallel cascade selection molecular dynamics (nt-PaCS-MD) is an enhanced conformational sampling method for proteins that does not rely on knowledge of the target structure. It makes use of cyclic resampling from relevant initial structures to expand the searched conformational subspace. The efficiency of nt-PaCS-MD depends on the selection of these initial structures. They are usually stochastically occurring perturbed structures at which larger conformational transitions are about to happen. Reliable identification of these structures is the key to using nt-PaCS-MD. Two new parameters, the moving root-mean-square deviation (mRMSD) and the inner products of the backbone dihedral angles Φ and Ψ, are introduced as indicators of conformational outliers in MD trajectories. Both are based on the analysis of a time-localized set of coordinates, overcoming the need for a target structure while still capturing the complexity of the conformational transition. The reference to which the mRMSD relates is the close surrounding of the i-th conformation, often the (i-1)-th one, hence the name “time-localized” analysis. In this work, we focus on the interplay of this analysis with nt-PaCS-MD and show that it increases the effectiveness of nt-PaCS-MD compared to older versions. The target system is the midsized protein T4 lysozyme (in explicit water), on which we demonstrate the open-closed transition without referring to any target configuration.
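The time-localized reference described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `moving_rmsd` and `rank_outliers`, the `lag` parameter, and the assumption that all frames have already been superposed on a common frame are introduced here for illustration only. The sketch computes the RMSD of each frame against the frame `lag` steps earlier and ranks frames by that value as candidate conformational outliers.

```python
import numpy as np


def moving_rmsd(traj, lag=1):
    """RMSD of frame i relative to frame i - lag (hypothetical helper).

    traj: array of shape (n_frames, n_atoms, 3); frames are assumed to be
    pre-aligned, so no superposition is performed here.
    Returns an array of length n_frames whose first `lag` entries are NaN,
    because those frames have no earlier reference.
    """
    traj = np.asarray(traj, dtype=float)
    if traj.ndim != 3 or traj.shape[2] != 3:
        raise ValueError("traj must have shape (n_frames, n_atoms, 3)")
    n_frames = traj.shape[0]
    if not 1 <= lag < n_frames:
        raise ValueError("lag must satisfy 1 <= lag < n_frames")

    out = np.full(n_frames, np.nan)
    # Displacement of every atom between frame i and frame i - lag.
    diff = traj[lag:] - traj[:-lag]
    # Mean squared displacement over atoms, then square root.
    out[lag:] = np.sqrt((diff ** 2).sum(axis=2).mean(axis=1))
    return out


def rank_outliers(mrmsd, n_select):
    """Indices of the n_select frames with the largest moving RMSD."""
    valid = np.flatnonzero(~np.isnan(mrmsd))
    order = valid[np.argsort(mrmsd[valid])[::-1]]
    return order[:n_select]


if __name__ == "__main__":
    # Toy trajectory: 4 frames, 2 atoms; a rigid jump occurs at frame 2.
    traj = np.zeros((4, 2, 3))
    traj[2:] += np.array([3.0, 4.0, 0.0])
    m = moving_rmsd(traj)
    print(m)
    print(rank_outliers(m, 1))
```

In this toy case only the frame at which the jump occurs receives a nonzero value, which is the sense in which a time-localized reference flags a transition without any target structure.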
Additionally, we show that the short MD trajectories can be used for the construction of a free energy landscape of the conformational transition based on the Markov state model.

1. INTRODUCTION

Proteins often use anisotropic, large-amplitude structural fluctuations to execute their biological functions. Molecular dynamics (MD) simulation is a powerful tool for reproducing/predicting essential structural fluctuations at the atomic level with femtosecond time resolution. Owing to recent developments of force field parameters, protein dynamics and the structural stability of proteins can be studied more quantitatively by MD simulations. However, it is still challenging to predict rare events relevant to biological function because the time scales accessible to conventional MD (CMD) fall short of the characteristic times of certain protein processes, which range from microseconds to seconds. This often leads to insufficient conformational sampling of proteins.

To tackle this issue, special-purpose machines can accelerate CMD simulations significantly. For instance, D. E. Shaw Research has developed a series of special-purpose machines called “Anton”.1−5 Recently, Anton made it possible to simulate millisecond-order folding processes of small proteins (fewer than 100 amino-acid residues) with atomic resolution.2−5

In contrast to the development of hardware, several enhanced sampling methods have been proposed to improve the insufficient conformational sampling of CMD. Examples such as targeted MD,6 steered MD,7,8 metadynamics,9−11 multicanonical MD (McMD),12 replica-exchange MD (REMD),13 and its variants14−21 have been implemented in well-established MD packages and widely applied to biological targets to elucidate biological functions driven by long-time (beyond microsecond) dynamics. With these enhanced sampling methods, statistically reliable conformational ensembles can be obtained.
However, it is difficult to directly reproduce/predict the long-time dynamics of proteins with the above-mentioned enhanced conformational sampling methods because they adopt a set of biased potentials, or distributed computing based on short-time MD simulations, to accelerate biologically relevant rare events. In the majority of the enhanced sampling methods, external constraints or biases are imposed on a given protein, i.e., the optimal external perturbations must be specified a priori. In contrast to the enhanced sampling methods, we have proposed an external perturbation-free

Received: May 20, 2019. Published: August 14, 2019. J. Chem. Theory Comput. (pubs.acs.org/JCTC), American Chemical Society. DOI: 10.1021/acs.jctc.9b00489