J. Parallel Distrib. Comput. 72 (2012) 1057–1064
journal homepage: www.elsevier.com/locate/jpdc

Acoustic scattering solver based on single level FMM for multi-GPU systems

Miguel López-Portugués a, Jesús A. López-Fernández a, Jonatan Menéndez-Canal b, Alberto Rodríguez-Campa b, José Ranilla b,*

a Departamento de Ingeniería Eléctrica, Electrónica, de Computadores y Sistemas, Universidad de Oviedo, Spain
b Departamento de Informática, Universidad de Oviedo, Spain

* Corresponding author.
E-mail addresses: mlopez@tsc.uniovi.es (M. López-Portugués), lopezjesus@uniovi.es (J.A. López-Fernández), UO189380@uniovi.es (J. Menéndez-Canal), rodriguezcalberto@uniovi.es (A. Rodríguez-Campa), ranilla@uniovi.es (J. Ranilla).

Article info

Article history: Received 25 February 2011. Received in revised form 15 July 2011. Accepted 27 July 2011. Available online 12 August 2011.

Keywords: FMM; GPGPU; Heterogeneous; Acoustic scattering

Abstract

In this paper, we present a heterogeneous parallel solver of a high frequency single level Fast Multipole Method (FMM) for the Helmholtz equation applied to acoustic scattering. The developed solution uses multiple GPUs to tackle the compute bound steps of the FMM (aggregation, disaggregation, and near interactions), while the CPU handles a memory bound step (translation) using OpenMP. The performance of the proposed solver is measured on a workstation with two GPUs (NVIDIA GTX 480) and is compared with that of a distributed memory solver run on a cluster of 32 nodes (HP BL465c) with an Infiniband network. Some energy efficiency results are also presented in this work.

© 2011 Elsevier Inc. All rights reserved.

1. Introduction

Nowadays, stringent environmental noise requirements [2] are a significant design driver for new aircraft structures. As a consequence, the implementation of computational tools that accurately model and predict acoustic scattering may dramatically increase the efficiency of the whole manufacturing process.

The Boundary Elements Method (BEM) [25] is an accurate numerical approach for solving acoustic scattering problems. Nevertheless, its computational cost may be prohibitive for large-scale problems. The BEM yields a linear system with $N$ equations and $N$ unknowns, whose direct solution is $O(N^3)$ in time and $O(N^2)$ in memory. Since $N$ increases with the scatterer size in wavelengths (it is proportional to the frequency squared, $f^2$, for surface discretizations), the efficient solution of the BEM linear system for real-world geometries and practical frequencies represents an interesting computational challenge.

The time cost is reduced to $O(N^2)$ per iteration using efficient iterative solvers, for instance the Generalized Minimum Residual (GMRES) method [22]. In addition, the high frequency single level Fast Multipole Method (FMM) [21] and its multilevel version, also known as the Multilevel Fast Multipole Algorithm (MLFMA) [23], reduce the iteration cost to $O(N^{1.5})$ and to $O(N \log N)$, respectively, when applied to the BEM for the Helmholtz equation. The FMM uses a multipole expansion of Green's function that permits an efficient computation of the matrix–vector products (MVPs) of the iterative solver, reducing the computational cost without significantly affecting its accuracy.

High frequency FMM for the Helmholtz equation in two dimensions was first published in [20] and presented for three dimensions in [21]. A practical description of the algorithm for a single level appears in [6]. Nonetheless, high frequency FMM may be unstable when the size of the group is smaller than a certain threshold [8]. In [8], Green's function is expanded using a combination of evanescent and propagating waves, yielding a stable FMM for low frequencies.
A description of both low frequency and high frequency FMM is detailed in [5].

In the past years, some proposals have been made to produce an efficient and accurate FMM over a wide range of frequencies (adaptive FMM). This may be of major concern for problems in which the discretization is in some way uneven due to the necessity of keeping a high geometric resolution at low frequencies. It is worth mentioning the wideband approach presented in [10], which requires neither interpolation nor filtering, resulting in an improvement of the work in [4].

0743-7315/$ - see front matter. doi:10.1016/j.jpdc.2011.07.013

In addition to scattering, the FMM is applied to a wide range of engineering problems related to acoustics. For instance, in [12] the Head Related Transfer Functions (HRTFs) are efficiently simulated for a wide frequency range taking advantage of the FMM. In [17], the FMM is used to accelerate the evaluation of the topological