Software Engineering for Multicore Systems – An Experience Report

Victor Pankratius
University of Karlsruhe
76131 Karlsruhe, Germany
pankratius@ipd.uka.de

Christoph Schaefer
University of Karlsruhe
76131 Karlsruhe, Germany
cschaefer@ipd.uka.de

Ali Jannesari
University of Karlsruhe
76131 Karlsruhe, Germany
jannesari@ipd.uka.de

Walter F. Tichy
University of Karlsruhe
76131 Karlsruhe, Germany
tichy@ipd.uka.de

ABSTRACT
The emergence of inexpensive parallel computers powered by multicore chips combined with stagnating clock rates raises new challenges for software engineering. As future performance improvements will not come “for free” from increased clock rates, performance critical applications will need to be parallelized. However, little is known about the engineering principles for parallel general-purpose applications.

This paper presents an experience report with four diverse case studies on multicore software development for general-purpose applications. They were programmed in different languages and benchmarked on several multicore computers. Empirical findings include:

• Multicore computers deliver: Real speedups are achievable, albeit with significant programming effort and speedups that are typically lower than the number of cores employed.

• Massive refactoring of sequential programs is required, sometimes at several levels. Special tools for parallelization refactorings appear to be an important area of research.

• Autotuning is indispensable, as manually tuning thread assignment, number of pipeline stages, size of data partitions and other parameters is difficult and error prone.

• Architectures that encompass several parallel components are poorly understood. Tuneable architectural patterns with parallelism at several levels need to be discovered.

Categories and Subject Descriptors
D.1.3 [Programming Techniques]: Concurrent Programming – Parallel programming; D.2.11 [Software Engineering]: Software Architectures – Patterns

General Terms
Experimentation, Performance, Design, Algorithms

Keywords
Multicore Systems, Design Patterns, OpenMP, Autotuning

Technical Report, Institute for Program Structures and Data Organization (IPD), University of Karlsruhe, Germany, December 2007

1. INTRODUCTION
Inexpensive multicore chips (chips with several processors) are pushing parallel computing out of the relative niche of high performance computing into the mainstream. Already in 2005, affordable dual-core laptops, quad-core PCs, and eight-core servers were available on the market. It went largely unnoticed that Cisco, also in 2005, developed a packet routing chip with 188 (!) processors [10]. The roadmaps of the semiconductor industry predict several hundreds of cores per chip in future generations [25, 30]. This development presents an opportunity that the software industry cannot ignore.

The bad news is that the era of doubling performance every 18 months has come to an end [23]. This means that the implicit performance improvement “for free” with every chip generation has also ended. Thus, future performance gains, required for new or improved applications, will have to come from parallelism.

Unfortunately, one cannot rely solely on compilers to perform the parallelization work [6], as the choice of parallelization strategy has a significant impact on performance and often requires massive program refactorings. Software engineering now faces the problem of developing parallel applications while keeping the cost and quality of software constant [6].

This paper takes stock of the current situation in multicore programming and suggests areas for future research and development. What are the tools and techniques we have right now to develop general-purpose software for multicore systems?
What are the problems and difficulties? Is multicore programming worth the additional effort? Where do we need extensions and future research? To answer these questions, we conducted four case studies with applications from different areas, written in different programming lan-