J Supercomput (2012) 62:787–803
DOI 10.1007/s11227-012-0749-y

Stencil computations on heterogeneous platforms for the Jacobi method: GPUs versus Cell BE

José M. Cecilia · José L. Abellán · Juan Fernández · Manuel E. Acacio · José M. García · Manuel Ujaldón

Published online: 15 February 2012
© Springer Science+Business Media, LLC 2012

J.M. Cecilia
Dept. of Computer Science, Catholic University of Murcia, Murcia, Spain
e-mail: jmcecilia@ucam.edu

J.L. Abellán · M.E. Acacio · J.M. García
Dept. of Computer Engineering, University of Murcia, Murcia, Spain
J.L. Abellán e-mail: jlabellan@ditec.um.es
M.E. Acacio e-mail: meacacio@ditec.um.es
J.M. García e-mail: jmgarcia@ditec.um.es

J. Fernández
Intel Barcelona Research Center, Intel Labs, Universitat Politècnica de Catalunya, Barcelona, Spain
e-mail: juan.fernandez@intel.com

M. Ujaldón
Computer Architecture Department, University of Malaga, Malaga, Spain
e-mail: ujaldon@uma.es

Abstract We are witnessing the consolidation of heterogeneous computing within parallel computing, with architectures such as the Cell Broadband Engine (Cell BE) and Graphics Processing Units (GPUs) present in a myriad of developments for high performance computing. These platforms provide a Software Development Kit (SDK) to maximize performance at the expense of dealing with complex, low-level architectural details, which makes software development a daunting task. This paper explores stencil computations in several heterogeneous programming models, such as Cell SDK, CellSs, ALF and CUDA, to optimize the Jacobi method for solving Laplace's differential equation. We describe the programming techniques used to extract the maximum performance on the Cell BE and the GPU, and compare their computing paradigms. Experimental results are shown on two Nvidia Teslas and one