Inverse Preconditioning Techniques on a GPUs Architecture in Global Ocean Models

RAFFAELE FARINA, University of Naples Federico II, Dept. of Applied Mathematics, Via Cinthia, 80126, Naples, ITALY, raffaele.farina@unina.it
SALVATORE CUOMO, University of Naples Federico II, Dept. of Applied Mathematics, Via Cinthia, 80126, Naples, ITALY, salvatore.cuomo@unina.it
PASQUALE DE MICHELE, University of Naples Federico II, Dept. of Applied Mathematics, Via Cinthia, 80126, Naples, ITALY, pasquale.demichele@unina.it
MARTA CHINNICI, ENEA-UTICT-PRA, Casaccia Research Center, S. Maria di Galeria, Rome, ITALY, martachinnici@gmail.com

Abstract: In this paper we show how NVIDIA GPU architectures can improve the performance of iterative solvers in the numerical problems related to global ocean circulation models. In this context, a linear system obtained from the elliptic core of an ocean model by means of finite differences is solved using an implementation of the Preconditioned Conjugate Gradient (PCG) method on Graphics Processing Units (GPUs). Furthermore, because the iterative solver converges slowly when the grid size grows and the domain of the ocean model includes specific regions of the Earth, the preconditioning technique of the linear system, here the diagonal preconditioner, is investigated. We show how the combination of a more efficient preconditioning technique with GPU computing accelerates the iterative solvers of these ocean models, making it possible to push the simulations further.

Key–Words: Laplace's Problem, Preconditioned Conjugate Gradient Method, Multi-Core Architecture, CUDA, CUBLAS and CUSP Libraries.

1 Introduction

Ocean models are a component of global climate models, which are increasingly used to study not only ocean dynamics but also the climate system.
Numerical models of ocean circulation support oceanography and climate science by providing tools to interpret ocean observations mechanistically, to investigate hypotheses for ocean phenomena experimentally, to consider future scenarios such as those associated with human-induced climate warming, and to forecast ocean conditions on weekly to decadal time-scales using dynamical modeling systems. In ocean climate modeling the time-scales of interest are decades to millennia, but the simulations require the resolution or parameterization of phenomena whose time-scales are minutes to hours. There is no obvious place where the spatial grid resolution is unimportant, and computational costs have strongly limited the use of novel, but often more expensive, numerical methods.

One of the main constraints on increasing the grid resolution of the models is the growing size of the associated numerical problems, with a consequent increase in the number of mathematical operations required to solve them. Furthermore, the same numerical problem, solved on different parts of the global domain (the Earth) of these ocean models, may have a solution cost that varies considerably. This occurs very often in the areas near the Earth's poles. Although the surface and volume of the polar oceans, Arctic and Antarctic, are negligible compared to the global domain, their dynamics have a strong influence on the global ocean circulation. This is due to the melting and freezing processes in these areas, which result in a high energy exchange. Moreover, these areas constitute a large reservoir of water and salt, and are thus affected by a high exchange of matter. For these reasons, the Earth's poles must be seriously considered in a global ocean model.

In this paper we investigate the global ocean model OPA-NEMO [1].
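For orientation, the kind of solver considered in this paper, a conjugate gradient iteration with a diagonal preconditioner applied to a finite-difference elliptic problem, can be sketched in a few lines. The following is a minimal CPU illustration in plain Python, not the OPA-NEMO or GPU implementation: the 5-point Laplacian on a uniform square grid, the zero Dirichlet boundaries, the constant right-hand side, and the names apply_laplacian and pcg_diagonal are all assumptions made for the example.

```python
# Sketch: conjugate gradient with a diagonal (Jacobi) preconditioner on a
# 5-point finite-difference Laplacian. Grid, boundary conditions and
# right-hand side are illustrative assumptions, not the OPA-NEMO operator.

def apply_laplacian(u, n):
    """Return A*u for the n-by-n 5-point Laplacian, zero Dirichlet boundaries."""
    out = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            k = i * n + j
            s = 4.0 * u[k]
            if i > 0:
                s -= u[k - n]
            if i < n - 1:
                s -= u[k + n]
            if j > 0:
                s -= u[k - 1]
            if j < n - 1:
                s -= u[k + 1]
            out[k] = s
    return out


def dot(a, b):
    """Euclidean inner product of two equally long lists."""
    return sum(x * y for x, y in zip(a, b))


def pcg_diagonal(n, b, tol=1e-10, max_iter=1000):
    """Solve A x = b with CG preconditioned by M = diag(A)."""
    inv_diag = [1.0 / 4.0] * (n * n)   # diag(A) equals 4 on every row here
    x = [0.0] * (n * n)
    r = list(b)                        # residual for the zero initial guess
    z = [d * ri for d, ri in zip(inv_diag, r)]
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for it in range(1, max_iter + 1):
        ap = apply_laplacian(p, n)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        z = [d * ri for d, ri in zip(inv_diag, r)]   # apply M^{-1}
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


n = 8
b = [1.0] * (n * n)
x, iterations = pcg_diagonal(n, b)
residual = [bi - ai for bi, ai in zip(b, apply_laplacian(x, n))]
residual_norm = dot(residual, residual) ** 0.5
print(iterations, residual_norm)
```

In this model problem the matrix has a constant diagonal, so the diagonal preconditioner only rescales the system and does not change the iterates of plain CG. It can only make a difference when the rows of the matrix are scaled differently, for example on a non-uniform grid.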
The parallel strategy of OPA-NEMO (as for many other ocean models) is Domain Decomposition [2], in which the global domain of integration is divided into many subdomains that are distributed among several processors. These processors belong to a high-performance computing architecture of MIMD type. We introduce GPU computing into the parallel strategy of the ocean models in order to reduce the execution times of the simulations. In this paper we first analyze the existing preconditioning technique used by the ocean model OPA-NEMO to solve a linear system obtained from its elliptic core by means of finite differences. We show some of the reasons for the slow convergence of the Conjugate Gradient (CG) solver with diagonal preconditioner, from a theoretical and numer- Recent Researches in Applied Mathematics and Informatics ISBN: 978-1-61804-059-6 15