Compiler automatic discovery of OmpSs task dependencies

Sara Royuela 1, Alejandro Duran 1,2, and Xavier Martorell 1

1 Barcelona Supercomputing Center {sara.royuela, xavier.martorell}@bsc.es
2 Intel Corporation alejandro.duran@intel.com

Abstract. Dependence analysis is an essential step for many compiler optimizations, from simple loop transformations to automatic parallelization. Parallel programming models require specific dependence analyses that take multi-threaded execution into account. Furthermore, the asynchronous parallelism introduced by OpenMP tasks has promoted the development of new dependency analysis techniques. In this setting, the OmpSs parallel programming model extends OpenMP tasks with the definition of inter-task dependencies. This extension allows runtime dependency detection, which can improve performance when load balancing or locality dominates execution time. On the other hand, the extension requires the user to determine the data-sharing attributes and the type of access to each datum in every task in order to specify the dependencies correctly. We aim to enhance the programmability of OmpSs with a new methodology that enables the compiler to determine the dependencies of OmpSs tasks automatically, thus releasing users from defining these dependencies manually. To this end, we have developed an algorithm based on the discovery of code concurrent with a task and on liveness analysis. The algorithm first finds all code concurrent with a given task. Then, it computes the data-sharing attributes of the variables appearing in the task. Finally, it analyzes the liveness properties of the task's shared variables. With this information, the algorithm derives the proper dependencies of the task. We have implemented this algorithm in the Mercurium source-to-source compiler.
We have tested the results with several benchmarks, proving that the algorithm is able to correctly find a large number of dependency expressions.

1 Introduction

The use of parallel programming models is a vital element in the achievement of higher performance and better programmability; in short, greater productivity. OpenMP* has become the most widely used parallel programming model for shared memory systems by virtue of its simplicity and scalability. Although the model already provides a stable and useful standard for the parallelization of structured loops and dense numerical applications, new research directions have appeared. One of them is the concept of task, which has grown out of the need to parallelize applications with different characteristics (e.g., load imbalance in loops, while-loop-based codes, recursiveness, etc.).