Evaluating the Impact of OpenMP 4.0 Extensions on Relevant Parallel Workloads ⋆

Raul Vidal, Marc Casas, Miquel Moretó, Dimitrios Chasapis, Roger Ferrer, Xavier Martorell, Eduard Ayguadé, Jesús Labarta, and Mateo Valero

Barcelona Supercomputing Center (BSC)
Universitat Politècnica de Catalunya (UPC)

Abstract. For many years, OpenMP has been the most widely used programming model for shared memory architectures. Periodically, new features are proposed and some of them are eventually selected for inclusion in the OpenMP standard. The OmpSs programming model developed at the Barcelona Supercomputing Center (BSC) aims to be an OpenMP forerunner that handles the main OpenMP constructs plus some extra features not included in the OpenMP standard. In this paper we show the usefulness of three OmpSs features not currently handled by OpenMP 4.0 by deploying them in three applications of the PARSEC benchmark suite and measuring the resulting performance benefits. This paper also shows the performance trade-offs between the OmpSs/OpenMP tasking and loop parallelism constructs, and shows that a hybrid implementation combining both approaches is sometimes the best option.

1 Introduction and Motivation

For many years, OpenMP has been the most popular programming model for shared memory architectures. The OmpSs programming model [6] developed at the Barcelona Supercomputing Center aims to be an OpenMP forerunner that handles the main OpenMP constructs plus other features not included in the OpenMP standard. OmpSs is based on #pragma annotations and its semantics are almost identical to those of the OpenMP standard. For these reasons, an OmpSs code that uses only the features included in the OpenMP standard is equivalent to its OpenMP counterpart. It is not straightforward to choose which OmpSs features should be adopted by the OpenMP standard, nor to predict how these new features would interact with the existing ones.
⋆ This paper is published in the Proceedings of the 11th International Workshop on OpenMP (IWOMP) 2015. Publication available at http://link.springer.com/book/10.1007/978-3-319-24595-9

This paper sheds some light on the above-mentioned dilemmas by pursuing two goals. The first is to show the usefulness of three OmpSs features not currently handled by OpenMP 4.0 by using them to accelerate three well-known applications of the PARSEC benchmark suite [3,4]. Second, this paper shows