mathematics
Article
Cross-Platform GPU-Based Implementation of Lattice
Boltzmann Method Solver Using ArrayFire Library
Michal Takáč and Ivo Petráš *
Citation: Takáč, M.; Petráš, I. Cross-Platform GPU-Based Implementation of Lattice Boltzmann
Method Solver Using ArrayFire Library. Mathematics 2021, 9, 1793.
https://doi.org/10.3390/math9151793
Academic Editor: Panagiota Tsompanopoulou
Received: 31 May 2021
Accepted: 26 July 2021
Published: 28 July 2021
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
Faculty BERG, Technical University of Košice, Němcovej 3, 042 00 Košice, Slovakia; michal.takac@tuke.sk
* Correspondence: ivo.petras@tuke.sk; Tel.: +421-55-602-5194
Abstract: This paper deals with the design and implementation of a cross-platform lattice Boltzmann
method solver for 2D and 3D flows (D2Q9-BGK and D3Q27-MRT), developed with the ArrayFire library
for high-performance computing. The solver leverages ArrayFire’s just-in-time compilation engine
to compile high-level code into optimized kernels for both CUDA and OpenCL GPU backends.
We also provide C++ and Rust implementations and show that it is possible to produce fast
cross-platform lattice Boltzmann method simulations with minimal code, effectively fewer than 90 lines.
Illustrative benchmarks (lid-driven cavity and Kármán vortex street) for single- and double-precision
floating-point simulations on four different GPUs are provided.
Dataset License: MIT
Keywords: lattice Boltzmann method (LBM); computational fluid dynamics (CFD); parallel computing;
graphics processing unit (GPU) computing; ArrayFire library; numerical analysis
1. Introduction
The popularity of the lattice Boltzmann method (LBM) has grown steadily since its inception
from lattice gas automata [1] more than three decades ago. Lattice gas automata are
a type of cellular automaton used to simulate fluid flows, and they were the precursor to the
LBM. The macroscopic Navier–Stokes equations can be derived from lattice gas automata.
One disadvantage of the lattice gas automata method is statistical noise; another is the
difficulty of extending the model to the 3D case. For these reasons, the LBM began to rise
in the early 1990s as an alternative procedure [2]. As a mesoscopic method,
filling the gap between macroscopic Navier–Stokes solvers and microscopic molecular
dynamics, it has been an important tool for numerical simulations of multi-component and
multiphase flows [3–5], flows in porous media [6,7], turbulent flows [8], and lately
more complex flows, for instance Bose–Einstein condensates [9], the interaction of
(2+1)-dimensional solitons [10], or the modeling of viscous quasi-incompressible flows [11].
Thanks to its computational simplicity and its spatial and temporal locality, the LBM is
naturally suited to parallel computing [12].
Recently, the increase in computational power and advances in general-purpose
computing on GPUs (GPGPU) have opened the door to real-time and interactive computational
fluid dynamics (CFD) simulations [13–17]. Combined with the performance of
the LBM, it is now possible to compute more than several hundred iterations
per second, which makes it possible to interact with a simulation in progress [18].
Instant feedback on changes to simulation parameters
lets researchers iterate faster toward an accurate model, a better
understanding of the underlying phenomena, or the use of simulation in the control
of industrial systems. It is, therefore, desirable to push the limits of execution speed
of LBM simulations. Although GPUs provide high memory bandwidth, developers have to be
careful with memory limitations, as LBM algorithms tend to consume
large amounts of memory for storing the data. GPU architecture is designed for high