Computer Physics Communications 183 (2012) 1890–1898

Multi-GPU acceleration of direct pore-scale modeling of fluid flow in natural porous media

Saeed Ovaysi, Mohammad Piri

Department of Chemical and Petroleum Engineering, University of Wyoming, Laramie, WY 82071-2000, USA

Article history: Received 23 June 2011; received in revised form 8 March 2012; accepted 16 April 2012; available online 25 April 2012

Keywords: GPU computing; Parallel programming; Moving Particle Semi-implicit; Particle-based methods; Porous media; MMPS

Abstract

Modified Moving Particle Semi-implicit (MMPS) is a particle-based method used to simulate pore-scale fluid flow through disordered porous media. We present a multi-GPU implementation of MMPS for hybrid CPU–GPU clusters using NVIDIA's Compute Unified Device Architecture (CUDA). Message Passing Interface (MPI) functions are used to communicate between different nodes of the cluster and hence their respective GPUs. The accuracy and stability of the GPU implementation of MMPS are verified through careful comparison with the results obtained on conventional CPU-only clusters. We then examine the speedup and scalability of the GPU implementation for pore-scale flow simulations in samples of various sizes taken from the same natural porous system. We achieve a 134× speedup with 60 graphics cards compared to 6 CPU cores while maintaining linear scalability. We also simulate incompressible fluid flow to steady state through a 1 mm × 1 mm × 8 mm microtomography image of Bentheimer sandstone in less than 1 h.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

Fluid flow in porous media is of great importance in many areas of science and technology including petroleum production, hydrology, and environmental remediation.
A better understanding of fluid flow in porous media, however, requires examining the physics of fluid flow at the pore level (micron level). Currently, it is very difficult to achieve this goal through experimental means alone. Therefore, it is crucial to develop models that are capable of reproducing the true physics of fluid flow through porous media at the pore level.

Recently, Modified Moving Particle Semi-implicit (MMPS) has been developed to directly model fluid flow in disordered porous media at the pore level [1]. MMPS, when applied to high-resolution images obtained using X-ray microtomography, can shed light on transport phenomena in natural porous media [2]. However, the high resolutions required to capture the pore-level complexities of natural porous media make the simulations computationally expensive. We have previously presented a parallel implementation of the method which scales linearly on distributed memory clusters [1]. Nonetheless, several hours are still required to complete a physically useful simulation on more than 200 processing cores. Fortunately, with the advances made in General Purpose computation on Graphics Processing Units (GPGPU), it is possible to reduce the computational cost significantly at a considerably lower price.

In this paper, we use Compute Unified Device Architecture (CUDA) developed by NVIDIA [3] to perform the simulations on Graphics Processing Units (GPUs). CUDA, which is an extension of C, is integrated with Message Passing Interface (MPI) to allow simulations across a multi-GPU platform with distributed memory. The underlying architecture of the code is written in C++.

In the following sections, we first briefly introduce the computational algorithm of MMPS. Next, our single-GPU algorithm is discussed. We then integrate the single-GPU code with the domain decomposition technique facilitated by MPI to finalize a multi-GPU code that runs on distributed memory computer clusters.
This is followed by scalability results obtained with this code. Finally, we present a case study in which fluid flow in a large sample is simulated using the multi-GPU code.

Corresponding author. Tel.: +1 307 766 4923; fax: +1 307 766 6777.
E-mail addresses: sovaysi@uwyo.edu (S. Ovaysi), mpiri@uwyo.edu (M. Piri).
doi:10.1016/j.cpc.2012.04.007

2. MMPS

MMPS is a Lagrangian particle-based method used to solve the incompressible Navier–Stokes equations in disordered porous media. The voxel image of the porous medium lends itself to a particle-based representation where the rock (void) space is mapped into solid