CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE
Concurrency Computat.: Pract. Exper. 2012; 24:445–462
Published online 19 October 2011 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/cpe.1882

SPECIAL ISSUE PAPER

Parallel application characterization with quantitative metrics

Alexander S. van Amesfoort 1,*,†, Ana Lucia Varbanescu 1,2 and Henk J. Sips 1

1 Dept. of Software Technology, Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands
2 Dept. of Computer Science, Vrije Universiteit, De Boelelaan 1081A, 1081 HV Amsterdam, The Netherlands

SUMMARY

When computer architects reinvented parallelism through multi-core processors, application parallelization became a problem. Now that multi-cores have spread from handhelds to supercomputers, parallelization has become a large-scale challenge. A lot of research is going into compiler improvements, language extensions, frameworks and application/platform case studies. Although fairly successful, these solutions are based on experimental tools, trial-and-error, and expert knowledge, and do not bring multi-core programming within reach of the whole software industry. We believe that the challenge of “mass parallelization” must be tackled more systematically. Development begins at application specification and algorithm design, followed by application characterization with trade-offs in parallelization strategies and data layouts. Only with a proper software design in place can implementation and optimization start. In this article, we focus on quantitative application characterization for such a systematic approach. We introduce a set of metrics to characterize applications and show how they can be used. We present our interpretation of the results and suggest ways to use them to guide design decisions. We conclude that metrics can be used to understand applications and design decisions early on.
Therefore, this characterization brings us closer to effective parallel application development for multi-core processors. Copyright © 2011 John Wiley & Sons, Ltd.

Received 14 February 2011; Revised 21 June 2011; Accepted 28 July 2011

KEY WORDS: application characterization; metric; software design; concurrency; locality

1. INTRODUCTION

After decades of improvements to single-core processors, the effort, power consumption, and cooling needed to maintain steady performance improvements became prohibitive. Processing demands are still increasing: larger data sets need to be analyzed in a timely manner, legacy applications need to run faster and use more complex models, and novel applications emerge. In an attempt to keep pace, processor architects reinvented parallelism, this time at a higher level by integrating multiple programmable cores onto a single chip. Whereas existing applications could exploit parallelism within a single core automatically or through better compilers, experience with programming multi-processor systems has shown that application programming needs to change drastically to take advantage of multiple cores. Even worse, conventional (symmetric) multi-processor systems only scaled to four or eight processors, whereas the chip industry now counts on scaling the number of (heterogeneous) cores far beyond that. To come close to peak performance, all parallelism layers with different granularities and concurrency levels must be fully used. In addition, going multi-core does not alleviate the growing gap between processor and memory performance, known as the “memory wall”.

*Correspondence to: Alexander S. van Amesfoort, Dept. of Software Technology, Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands.
†E-mail: a.s.vanamesfoort@tudelft.nl