Routine Microsecond Molecular Dynamics Simulations with AMBER
on GPUs. 2. Explicit Solvent Particle Mesh Ewald
Romelia Salomon-Ferrer,† Andreas W. Götz,† Duncan Poole,‡ Scott Le Grand,‡,∥ and Ross C. Walker*,†,§
†San Diego Supercomputer Center, University of California, San Diego, 9500 Gilman Drive MC0505, La Jolla, California 92093, United States
‡NVIDIA Corporation, 2701 San Tomas Expressway, Santa Clara, California 95050, United States
§Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive MC0505, La Jolla, California 92093, United States
* Supporting Information
ABSTRACT: We present an implementation of explicit solvent all atom classical molecular
dynamics (MD) within the AMBER program package that runs entirely on CUDA-enabled
GPUs. First released publicly in April 2010 as part of version 11 of the AMBER MD package and
further improved and optimized over the last two years, this implementation supports the three
most widely used statistical mechanical ensembles (NVE, NVT, and NPT), uses particle mesh
Ewald (PME) for the long-range electrostatics, and runs entirely on CUDA-enabled NVIDIA
graphics processing units (GPUs), providing results that are statistically indistinguishable from
the traditional CPU version of the software and with performance that exceeds that achievable
by the CPU version of AMBER software running on all conventional CPU-based clusters and
supercomputers. We briefly discuss three different precision models developed specifically for this work (SPDP, SPFP, and
DPDP) and highlight the technical details of the approach as it extends beyond previously reported work [Götz et al., J. Chem.
Theory Comput. 2012, DOI: 10.1021/ct200909j; Le Grand et al., Comput. Phys. Commun. 2013, DOI: 10.1016/j.cpc.2012.09.022]. We
highlight the substantial improvements in performance that are seen over traditional CPU-only machines and provide validation
of our implementation and precision models. We also provide evidence supporting our decision to deprecate the previously
described fully single precision (SPSP) model from the latest release of the AMBER software package.
1. INTRODUCTION
Classical molecular dynamics (MD) has been extensively used in atomistic studies of biological and chemical phenomena including the study of biological ensembles of proteins, amino acids, lipid bilayers, and carbohydrates.1−13 With the development of new algorithms and the emergence of new hardware platforms, MD simulations have dramatically increased in size, complexity, and simulation length. In particular, graphics processing units (GPUs) have emerged as an economical and powerful alternative to traditional CPUs for scientific computation.14−17
GPUs are present in most modern high-end desktops and are now appearing in the latest generation of supercomputers. When programmed correctly, software running on GPUs can significantly outperform software running on CPUs. This is due to a combination of high computational power, in terms of peak floating point operations, and high memory bandwidth. This combination makes GPUs an ideal platform for mathematically intense algorithms that can be expressed in a highly parallel way. On the downside, the inherently parallel nature of the GPU architecture necessitates a decrease in flexibility and an increase in programming complexity in comparison to CPUs.
The success of and high demand for GPUs in the gaming and 3D image rendering industries have fueled sustained development for over two decades, leading to extremely cost-effective hardware for scientific computation. The first GPU with features specifically targeted at scientific computation was released by NVIDIA in 2007, with a subsequent generation following a year later that provided the first support for double precision floating point arithmetic. At the time of writing, NVIDIA's latest generation of GPUs is based on the Kepler GK104 and GK110 chips. These two chip designs, like earlier models, provide very different ratios of single to double precision performance. The GK104 is targeted at algorithms that rely extensively on single precision, while the GK110 offers substantially higher double precision performance. As discussed later, it is necessary to carefully tune the use of single and double precision floating point, and ultimately fixed precision arithmetic, to achieve high performance across these different hardware designs without compromising the integrity of the underlying mathematics.
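The appeal of fixed precision accumulation can be illustrated with a short sketch. The snippet below is illustrative only (the function names and the scale factor 2**40 are assumptions for this example, not details taken from the AMBER implementation): per-pair contributions are computed in floating point, scaled into 64-bit integers, and summed with integer addition, which is exact and associative, so the total is independent of summation order — unlike a floating-point sum, whose result can vary with the (parallel) order of accumulation.

```python
# Illustrative sketch of fixed-point accumulation (not AMBER source code).
# FP_SCALE is an assumed scale factor chosen for this example.
FP_SCALE = 1 << 40


def to_fixed(x: float) -> int:
    """Convert a floating-point contribution to 64-bit fixed point."""
    return int(round(x * FP_SCALE))


def from_fixed(i: int) -> float:
    """Convert an accumulated fixed-point value back to floating point."""
    return i / FP_SCALE


def accumulate(contributions) -> float:
    """Sum contributions via integer arithmetic: exact, order-independent."""
    acc = 0
    for c in contributions:
        acc += to_fixed(c)  # integer addition is associative
    return from_fixed(acc)


# Magnitudes differing by many orders would expose floating-point
# order dependence; the fixed-point sum is identical in any order.
forces = [1e-7, 1.0, -1.0, 3e-8]
assert accumulate(forces) == accumulate(list(reversed(forces)))
```

On a GPU, this property is what allows many threads to accumulate into a shared total (e.g., via atomic integer additions) and still produce bitwise-reproducible results from run to run.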
There are a large number of scientific software packages that have been successfully ported to run on GPUs.12,13,18 In the molecular dynamics field there have been attempts to port major MD packages to GPUs. For a review of the progress, the reader is referred to the review article in ref 12. A number of widely used MD packages designed for the simulation of condensed phase biological systems exist that feature varying degrees of GPU support, including NAMD,19,20 AMBER,21,22
Received: April 17, 2013
Article
pubs.acs.org/JCTC
© XXXX American Chemical Society | dx.doi.org/10.1021/ct400314y | J. Chem. Theory Comput. XXXX, XXX, XXX−XXX