Achieving Predictable Execution in COTS-based Embedded Systems

Stanley Bak, Rodolfo Pellizzoni, Emiliano Betti, Gang Yao, John Criswell, Marco Caccamo, Russel Kegley
University of Illinois at Urbana-Champaign, USA; University of Waterloo, Canada; Lockheed Martin Corp., USA

Abstract—Building safety-critical real-time systems out of inexpensive, non-real-time, Commercial Off-the-Shelf (COTS) components is challenging. Although COTS components generally offer high performance, they can occasionally incur significant timing spikes. To prevent this, we propose controlling the operating point of each shared resource, for example main memory, to keep it below its saturation limit. This is necessary because the low-level arbiters of these shared resources are not typically designed to provide real-time guarantees. Here, we discuss a novel system execution model, the PRedictable Execution Model (PREM), which, in contrast to the standard COTS execution model, coschedules at a high level the components in the system that may access main memory, such as CPUs and I/O peripherals. To enforce predictable, system-wide execution, we argue that real-time embedded applications should be compiled according to a new set of rules dictated by PREM. To experimentally validate the proposed theory, we developed a COTS-based PREM testbed and modified the LLVM Compiler Infrastructure to produce PREM-compatible executables.

I. PREDICTABLE EXECUTION MODEL (PREM)

Building computer systems out of commercial off-the-shelf (COTS) components, as opposed to custom-designed parts, typically improves time-to-market, reduces system cost, and generally provides better performance. For real-time systems, however, one hurdle in the way of using COTS components is the transient timing spikes that may occur when there is contention for shared resources.
The low-level arbiter of shared resources in a COTS system typically has no mechanism to account for the timeliness of incoming requests; as a result, less critical requests may delay more critical tasks, causing an unintended and undesirable priority inversion. The PRedictable Execution Model (PREM) [1], in contrast to the standard COTS execution model, coschedules at a high level all active components in the system, such as CPU cores and I/O peripherals. Briefly, the key idea is to control when active components access shared resources, so that contention is resolved implicitly by the high-level coscheduler without relying on low-level, non-real-time arbiters. Here, we specifically focus our attention on contention at the level of the interconnect and main memory.

[Fig. 1 labels: CPU execution; cache fetches and replacements; peripheral data transfers; memory phase; execution phase]
Fig. 1. A predictable interval is composed of two main phases: a memory phase and an execution phase. Peripherals are allowed to access the bus only during the execution phase.

A. Scheduling Memory Access

In order to schedule access to main memory at a high level, we propose the PREM execution model, in which tasks running on the CPU are logically divided into two types of intervals: compatible intervals and predictable intervals. Compatible intervals are compiled and executed without further modification; they are backwards compatible, but they should nonetheless be minimized in order to provide a good level of resource (main memory) utilization. Predictable intervals, on the other hand, are split into two phases: a memory phase and an execution phase. During the memory phase, the task can access main memory, typically loading the cache with the data that will be accessed during the rest of the interval. During the execution phase, no further main memory access is performed.
Each interval executes for a fixed amount of time equal to its worst-case execution time, which simplifies system-wide scheduling. Under PREM, peripheral access to main memory is also controlled: peripherals may access main memory only while a task is in its execution phase. In this way, only one active component at a time accesses main memory, avoiding the effects of the non-real-time interconnect and memory arbiters. Figure 1 shows an example of a single predictable interval. A more complex scenario is shown in Figure 2, where two tasks (τ1 and τ2) run together with two related peripheral I/O flows (τ1^I/O and τ2^I/O). In this case, the input and output for each task is done in the adjacent I/O periods (double