WCET-aware Scheduling Optimizations for Multi-Core Real-Time Systems

Timon Kelter
TU Dortmund University
Otto-Hahn-Strasse 16, 44227 Dortmund, Germany
Email: timon.kelter@tu-dortmund.de

Hendrik Borghorst
TU Dortmund University
Otto-Hahn-Strasse 16, 44227 Dortmund, Germany
Email: hendrik.borghorst@tu-dortmund.de

Peter Marwedel
TU Dortmund University
Otto-Hahn-Strasse 16, 44227 Dortmund, Germany
Email: peter.marwedel@tu-dortmund.de

Abstract—In real-time systems, the WCET (worst-case execution time) of tasks is of utmost importance. For multi-cores, the WCET has been shown to be hard to determine due to task interactions on shared memory and shared buses. This problem is usually addressed by spatial or temporal partitioning of the resources, but both lead to lower utilization if the partitioning is not done optimally. We examine two approaches for optimizing resource usage in a temporally partitioned multi-core system and show that these techniques can reduce the WCET by more than 30% on average, leading to better schedulability and higher system utilization.

I. INTRODUCTION

Most of today's high-performance processors are multi-cores, not only in the desktop and server markets but also in the embedded systems market. Though the increased overall computational power is beneficial to the average-case application, multi-cores pose a fundamental problem for safety-critical real-time applications. Since some of the hardware components in a multi-core system are shared between cores, tasks that execute on different cores may interfere with each other during accesses to shared components. This breaks the isolation between tasks and makes their worst-case execution time (WCET) harder or even impossible to predict. Since the WCET is needed for schedulability analysis and certification of safety-critical systems, the current industrial practice is to deactivate all but a single core to bring the system back into a predictable state [1].
To overcome this unsatisfactory state, it is necessary to know how the shared resources are arbitrated among contending requests from multiple cores. In addition, the timing behavior of the arbiter must be precisely defined.

Recent publications have discussed the implications of different types of arbitration methods on the achievable analysis precision [2]. Time-triggered arbitration methods were found to be best suited for tight WCET estimation, but their performance is highly dependent on their parameterization and on the structure of the examined programs.

To overcome these problems, we present two novel optimizations that can significantly improve not only the WCET but also the average-case execution time (ACET) of programs running on timing-predictable multi-cores. The first is an evolutionary optimization of the shared resources' schedule parameters, whereas the second is a multi-core WCET-aware instruction scheduling which restructures the input programs to increase their performance on a given time-predictable multi-core platform. Both optimizations result in lower WCETs, which in turn leads to improved schedulability and increased resource utilization for multi-core real-time systems.

Section II gives an overview of existing approaches and related work, and Section III introduces the system model that we use for the experiments. Sections IV and V present the aforementioned novel optimizations, and Section VI closes the paper with a summary and directions for future work.

II. RELATED WORK

The standard approach to WCET analysis [3] has recently been extended towards the analysis of multi-core systems [4], [5], [2], which makes it possible for us to consider multi-core WCET as an optimization target.
The optimization of bus schedules has been the topic of a range of previous publications, but the vast majority is either restricted to TDMA schedules or uses ad-hoc WCET computations instead of an analyzer following established design principles [3]. The optimization in [6] and [7] by the same authors is based on search heuristics (simulated annealing) and is similar to our evolutionary optimization in this respect. It also integrates system-wide task scheduling with optimization, but it is restricted to TDMA schedules, whereas we also consider more flexible schedule variants. TDMA slot length allocation is also done in [8], but the employed WCET analysis framework is less precise and the approach is again restricted to TDMA. Concerning the employed evolutionary variation operators, we use an approach similar to [9], but [9] is restricted to TDMA and considers the optimization at a far more coarse-grained level, i.e. the scheduling of tasks as a whole. Finally, [10] also examines bus schedule optimization, but only for the special case of Harmonic Round-Robin schedules and for additive WCET models.

The majority of previous publications on WCET-aware instruction scheduling focuses on optimizing the WCET of a single-core system [11], [12]. As an exception, [13] discusses several access models for time-predictable multi-cores on an abstract level, but requires manual restructuring of the tasks. In contrast, the instruction scheduler presented in this paper can be used to automatically implement these models on a micro-architectural scale.
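
To make concrete why the parameterization of a TDMA bus schedule matters for the worst case, the following minimal Python sketch may help. It is not taken from the paper and rests on simplifying assumptions: a schedule is a cyclic list of (owner core, slot length) pairs, every bus access takes a fixed number of cycles, and an access may only start if it also completes within a slot owned by the requesting core. The names `tdma_wait` and `worst_case_wait` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Assumed model: one TDMA slot is (owner core id, slot length in cycles).
Slot = Tuple[int, int]


def tdma_wait(schedule: List[Slot], core: int, t: int, access_len: int) -> int:
    """Cycles a request issued by `core` at time `t` waits before it may start."""
    period = sum(length for _, length in schedule)

    # Collect the offsets within one period at which `core` may start an
    # access so that it still finishes inside its own slot.
    windows = []
    start = 0
    for owner, length in schedule:
        if owner == core and length >= access_len:
            windows.append((start, start + length - access_len))
        start += length
    if not windows:
        raise ValueError("core can never complete an access in this schedule")

    offset = t % period
    best = None
    for lo, hi in windows:
        if lo <= offset <= hi:
            return 0  # request arrives inside a usable window
        wait = (lo - offset) % period  # cycles until this window opens again
        best = wait if best is None else min(best, wait)
    return best


def worst_case_wait(schedule: List[Slot], core: int, access_len: int) -> int:
    """Maximum wait over all possible arrival offsets within one period."""
    period = sum(length for _, length in schedule)
    return max(tdma_wait(schedule, core, t, access_len) for t in range(period))


if __name__ == "__main__":
    coarse = [(0, 10), (1, 10)]               # two long slots per period
    fine = [(0, 5), (1, 5), (0, 5), (1, 5)]   # same bus share, shorter slots
    print(worst_case_wait(coarse, 0, 4))
    print(worst_case_wait(fine, 0, 4))
```

Under these assumptions, both example schedules grant each core half of the bus time, yet the worst-case wait of core 0 for a 4-cycle access differs (13 cycles for the coarse schedule, 8 cycles for the fine one). This is only a toy illustration of the sensitivity to slot parameters; it does not model the WCET analysis or the schedule variants considered in the paper.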