User-Friendly Methodology for Automatic Exploration of Compiler Options: A Case Study on the Intel XScale Microarchitecture

Haiping Wu, Eunjung Park, Long Chen, Juan del Cuvillo, Guang R. Gao
University of Delaware
Department of Electrical and Computer Engineering
Newark, Delaware 19716, U.S.A.
{hwu, epark, lochen, jcuvillo, ggao}@capsl.udel.edu

Abstract

Finding an optimized combination of compiler options that most benefits a given embedded application is a challenge for most application developers. Only with a deep understanding of the application at hand and a fairly good knowledge of the compiler's features can a programmer achieve the desired results in terms of performance, power consumption and code size.

We have developed a practical methodology for automatically exploring compiler options (UMECO) to solve this problem. This paper reports a case study of this methodology on the Intel XScale microarchitecture. Based on practical experimentation, we enhance KCC, our research compiler infrastructure, with an extended user interface through which users can provide advice. All of this is controlled by a reduced set of compiler flags. We also demonstrate a compiler trade-off strategy based on experimental results for a set of well-known embedded benchmarks.

Keywords: Compiler option, Performance, Power, Code size, Microarchitecture

1. Introduction

The specific features and requirements of embedded applications pose new challenges to traditional compiler design methodology. These challenges have resulted in numerous studies that focus on improving compiler technology to meet the specific requirements of embedded systems [3, 4, 7, 9].
Modern compilers for embedded systems support irregular microarchitectures (e.g., digital signal processors, network processors, micro-controller units) and complex instruction sets (e.g., application-specific instruction sets), and capture architecture-specific optimization features (e.g., parallelism in multi-core or multi-function units).

To handle such a broad spectrum of possibilities, compilers supply a large number of optimization options. Unfortunately, it is the application developers’ responsibility to find a suitable combination of options for each specific application, and finding a combination such that the compiled program meets its specifications is not a trivial task. The problem becomes even more complicated when a trade-off between performance (execution time), power consumption and code size (bytes of the text section) is added to the list of requirements. From the application developers’ perspective, it would be desirable to have a simple compiler-user interface that, based on an application profile, could come up with an optimal combination of compiler options, allowing developers to concentrate on other aspects of development.

Therefore, there is a strong need for methodologies that find an optimal combination of compiler options for different applications running on a specific architecture.

We have developed a practical methodology for automatically exploring compiler options (UMECO) [6], which can be applied directly to embedded applications. The strategy behind the methodology is to first find an optimized combination of compiler options by experimental measurement for a set of typical applications. Armed with this knowledge, the compiler is then able to automatically make a good trade-off between performance, power consumption and code size for applications in the same domain.
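As a rough illustration of the strategy described above, the following Python sketch enumerates on/off combinations of a small flag set and picks the one with the lowest weighted cost. It is only a sketch under stated assumptions: the exhaustive enumeration, the weighted-sum score, the flag names "unroll" and "schedule", and the toy cost model are all hypothetical and are not taken from UMECO or KCC. In practice, the measure function would be replaced by real measurements of execution time, power consumption and code size on the target board.

```python
from itertools import product


def explore(flags, measure, weights):
    """Try every on/off combination of `flags` and keep the cheapest one.

    measure(enabled) -> (time, power, size), each relative to a baseline build.
    weights          -> (w_time, w_power, w_size), the user's trade-off advice.
    Returns (score, enabled_flags) for the lowest weighted score; on ties,
    the combination enumerated first wins.
    """
    best = None
    for mask in product((False, True), repeat=len(flags)):
        enabled = tuple(f for f, on in zip(flags, mask) if on)
        metrics = measure(enabled)
        score = sum(w * m for w, m in zip(weights, metrics))
        if best is None or score < best[0]:
            best = (score, enabled)
    return best


def toy_measure(enabled):
    """Placeholder cost model, NOT measured data: effects are made up."""
    time, power, size = 1.0, 1.0, 1.0
    if "unroll" in enabled:      # hypothetical flag: faster, larger code
        time -= 0.2
        size += 0.3
    if "schedule" in enabled:    # hypothetical flag: faster, more power
        time -= 0.1
        power += 0.05
    return time, power, size


FLAGS = ("unroll", "schedule")   # hypothetical flag names
speed_first = explore(FLAGS, toy_measure, (1.0, 0.0, 0.0))
size_first = explore(FLAGS, toy_measure, (0.0, 0.0, 1.0))
```

With this toy model, weighting only execution time selects both flags, while weighting only code size selects none. This mirrors the kind of trade-off that the user's advice is meant to steer. Exhaustive enumeration grows exponentially with the number of flags, so it is practical only for a reduced flag set.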
This paper is a progress report on part of the UMECO work on the Intel XScale microarchitecture, for which we setup a hardware testbed using the Intel XScale 80200 Eval-