International Journal of Computer Applications (0975-8887)
Volume 86 - No. 9, January 2014

Bridging the Performance Gap between Manual and Automatic Compilers with Intent-based Compilation

Waseem Ahmed
College of Computer Science
King Khalid University, Abha, Saudi Arabia

ABSTRACT

In spite of years of research in automatic parallelization, progress has been slow in terms of tools that can consistently generate scalable, portable and efficient code for multiple architectures. Moreover, a substantial difference in efficiency exists between code generated automatically and code generated by an expert programmer. Although the fact that the best sequential algorithm for a problem can be very different from the best parallel algorithm is well known, the feature of algorithm substitution is absent from most tools available today. However, automatically identifying an algorithm used in code is not trivial considering the nuances in programming style, algorithmic representations and expressions. This paper presents a novel Intent Based Compilation approach that uses a rule-based Expert System Engine to identify the intent of the algorithm used in the code based on fine- and coarse-grained features extracted from code. Using this information, the most optimized algorithm for the target architecture is then substituted. Results obtained by using Amoeba, a framework that incorporates this methodology, on codes obtained from the public domain are presented.

General Terms: Automatic Parallelization, Parallelization Tools

Keywords: Intent-based compilation, automation, code-to-code transformers, parallelization, parallel compilers

1. INTRODUCTION

With the increased pervasiveness of massively parallel GPUs, accelerators and FPGAs on general-purpose commodity computers, parallel computing is no longer restricted to elitist machines [1].
The popularity of clusters made with off-the-shelf components, the increasing number of cores per processor die and the presence of multiple processors on single machines will have a large impact on the parallel programming community. Parallel programming, which was once restricted to the HPC community, will soon involve mainstream programmers in fields as diverse as embedded systems, browser development, game programming and operating systems for smartphones, tablets, netbooks and game consoles. The following subsections highlight the effects of these trends on automatic parallelization.

1.1 Software and Software Compiler Requirements

In the past, each generation of hardware brought increased performance for existing applications, and a code rewrite was not needed [2]. With the pervasiveness of diverse and specialized computing architectures in HPC, high-end servers, Multi-Processor Systems-on-Chip (MPSoCs), laptops and mobile platforms, code portability will soon become a major challenge. The responsibility of ensuring scalable, portable and efficient parallel programming for these specialized architectures will rest on the application developers and on the suite of tools available.

The HPC community relies on a large base of legacy sequential code for its scientific computation. Parallelizing such applications for even a single architecture is a complex exercise that requires both domain expertise and sophisticated programming skills. In many cases, these applications are executed on various platforms during their lifetime. This makes debugging, testing, porting, maintaining and versioning the code for these applications challenging.
1.2 Automatic versus manual parallelization

In spite of decades of research in parallel development tools (automatic compilers, code-to-code transformers, parallel debuggers, auto tuners and parallel development environments, collectively referred to as parallel tools or parallel development environments in the rest of this paper), manual parallelization still continues. One main reason is that automatically generated code, in the general case, can never be as efficiently optimized for execution on a particular architecture as hand-programmed code [3].

Indeed, the ability of a specialized human programmer to make complex code transformations judiciously, by intuition and experience, clearly defines the path that future parallel tools should take. This intelligent human factor is seldom incorporated in automatic compilers.

1.3 Loops and algorithms

The choice of algorithms used in a program heavily influences the efficiency of the final application. To illustrate, consider two sequential algorithms $A_1$ and $A_2$ that consist of $O(n^3)$ and $O(n^2 \log n)$ operations, respectively, available to solve a particular problem. A sequential implementation will prefer the second, more efficient algorithm. An automatic parallelization tool will work on the premise that this is the best possible implementation. The ease and the degree of parallelization are not con-