CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE
Concurrency Computat.: Pract. Exper. (2012)
Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/cpe.2852

SPECIAL ISSUE PAPER

Multiple threads and parallel challenges for large simulations to accelerate a general Navier–Stokes CFD code on massively parallel systems

Yvan Fournier 1, Jerome Bonelle 1, Pascal Vezolle 2, Jerry Heyman 3, Bruce D’Amora 4,*,†, Karen Magerlein 4, John Magerlein 4, Gordon Braudaway 4, Charles Moulinec 5 and Andrew Sunderland 5

1 Electricité De France R&D, Paris, France
2 IBM Systems and Technology Group, Paris, France
3 IBM Systems and Technology Group, Raleigh, NC
4 IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
5 STFC Daresbury Laboratory, WA4 4AD, Daresbury, UK

SUMMARY

Computational fluid dynamics is an increasingly important application domain for computational scientists. In this paper, we propose and analyze optimizations necessary to run CFD simulations consisting of multibillion-cell mesh models on large processor systems. Our investigation leverages the general industrial Navier–Stokes CFD application, Code_Saturne, developed by Electricité de France for incompressible and nearly compressible flows. In this paper, we outline the main bottlenecks and challenges for massively parallel systems and emerging processor features such as many-core, transactional memory, and thread level speculation. We also present an approach based on an octree search algorithm to facilitate the joining of mesh parts and to build complex larger unstructured meshes of several billion grid cells. We describe two parallel strategies of an algebraic multigrid solver and we detail how to introduce new levels of parallelism based on compiler directives with OpenMP, transactional memory and thread level speculation, for finite volume cell-centered formulation and face-based loops.
A renumbering scheme for mesh faces is proposed to enhance thread-level parallelism. Copyright © 2012 John Wiley & Sons, Ltd.

Received 30 May 2011; Accepted 4 April 2012

KEY WORDS: transactional memory; thread level speculation; finite volume method

*Correspondence to: Bruce D’Amora, IBM T.J. Watson Research Center, 1101 Kitchawan Road, 15-257 Yorktown Heights, NY 10598. E-mail: damora@us.ibm.com

1. INTRODUCTION

Computational fluid dynamics is an increasingly important application domain for computational scientists, with a wide range of applications in many industries including automotive, aerospace, and energy. The scale of CFD simulation problems is rapidly increasing because of a demand for higher spatial resolution and more detailed physics. This includes turbulence modeling, where hybrid Reynolds-averaged Navier–Stokes (RANS)/large eddy simulation (LES) approaches and even pure LES, which require smaller cell aspect ratios than pure RANS, are beginning to be accepted as modeling tools. Production simulation problems are now reaching the range of hundreds of millions of mesh cells. There is a near-term demand for solutions to problems with complex multibillion-cell meshes and petaflop simulations if the technology can be developed to make this practical.

CFD approaches