Performance analysis of the static use of locking caches.

A. MARTÍ CAMPOY, A. PERLES, S. SÁEZ, J.V. BUSQUETS-MATAIX
Departamento de Informática de Sistemas y Computadores
Universidad Politécnica de Valencia
E-46022 Valencia
SPAIN
http://www.upv.es

Abstract: The unpredictable behavior of conventional caches presents several problems when they are used in real-time multitask systems. It is difficult to determine their effect on the Worst Case Execution Time, and they introduce additional delays when different tasks compete for cache contents. This analysis complexity may be reduced by using alternative cache architectures that improve predictability while achieving similar performance. This is the case of locking caches, which can preload and lock cache contents, precluding replacement during system operation and thus making cache and system behavior more predictable by means of simple, well-known and easy-to-use algorithms. This work presents an analysis of the worst-case performance obtained with the static use of locking caches versus the worst-case performance obtained with conventional (non-locking) caches. Analysis results show that predictability can be reached with no loss of performance, and that the scenarios where the locking cache provides performance similar to conventional caches may be estimated from system parameters such as task size, cache size, level of locality and level of interference, without running any experiment.

Key-words: real-time systems, cache memories, locking cache, schedulability analysis, performance.

1 Introduction

Modern microprocessors include cache memories in their memory hierarchy to increase system performance. General-purpose systems benefit directly from this architectural improvement, but hard real-time systems need additional hardware resources and/or system analysis to guarantee the timing correctness of the system's behavior when cache memories are present.

In multitask, preemptive real-time systems, the use of cache memories presents two problems. The first problem is to calculate the Worst Case Execution Time (WCET) in the presence of intra-task or intrinsic interference. Intra-task interference occurs when a task removes its own instructions from the cache due to conflict and capacity misses. When the removed instructions are executed again, a cache miss increases the execution time of the task. The delay caused by this interference must therefore be included in the WCET calculation. The second problem is to calculate the task response time in the presence of inter-task or extrinsic interference. Inter-task interference occurs in preemptive multitask systems when a task displaces the working set of any other task from the cache. When the preempted task resumes execution, a burst of cache misses increases its execution time. This effect, called cache-refill penalty or cache-related preemption delay, must be considered in the schedulability analysis, since it pushes the task's execution time beyond the precalculated WCET.

Several solutions have been proposed for the use of cache memories in real-time systems. In [1,2,3] cache behavior is analyzed to estimate task execution time considering the intra-task interference. In [4,5] cache behavior is analyzed to estimate task response time considering the inter-task interference, using a precalculated cached WCET. The main drawback of these solutions is the complexity of the algorithms and methods needed to carry out the analysis. Also, each method considers only one side of the problem, either the intra-task interference or the inter-task interference, but not both.

Alternative architectures to conventional cache memories have been proposed in order to eliminate or reduce cache unpredictability, making the schedulability analysis easier.
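
As a rough illustration of the intra-task and inter-task interference described in this section, the following sketch simulates instruction fetches on a toy direct-mapped cache. All parameters (4 lines of 16 bytes, 4-byte instructions), the function names and the task layouts are assumptions chosen for the example; they are not taken from the systems analyzed in this work. The `run_locked` function is a deliberately simplified model of a statically locked cache (preloaded contents, no replacement, no additional buffering) and not a description of any particular hardware.

```python
LINE_SIZE = 16   # bytes per cache line (assumed for the example)
NUM_LINES = 4    # lines in the toy direct-mapped cache (assumed)
INSTR_SIZE = 4   # bytes per instruction (assumed)


def loop_trace(start, size, iterations):
    """Addresses fetched by a loop of `size` bytes located at `start`."""
    return [addr
            for _ in range(iterations)
            for addr in range(start, start + size, INSTR_SIZE)]


def new_cache():
    """Empty direct-mapped cache: one stored block number per line."""
    return [None] * NUM_LINES


def run(trace, cache):
    """Fetch `trace` through a conventional direct-mapped cache.

    `cache` is updated in place, so its contents carry over between
    calls (this is what lets one task disturb another). Returns the
    number of misses.
    """
    misses = 0
    for addr in trace:
        block = addr // LINE_SIZE
        index = block % NUM_LINES
        if cache[index] != block:
            misses += 1
            cache[index] = block   # replacement evicts the old block
    return misses


def run_locked(trace, locked_blocks):
    """Fetch `trace` with a preloaded, locked set of blocks.

    Simplified model: a fetch hits only if its block was locked in
    advance; nothing is ever replaced. Returns the number of misses.
    """
    return sum(1 for addr in trace if addr // LINE_SIZE not in locked_blocks)


def intra_task_misses():
    """A 5-block loop in a 4-line cache: blocks 0 and 4 evict each other."""
    return run(loop_trace(0, 80, 3), new_cache())


def inter_task_misses():
    """Task A runs, task B preempts and evicts A, then A resumes."""
    cache = new_cache()
    first = run(loop_trace(0, 64, 1), cache)      # A: cold misses
    run(loop_trace(64, 64, 1), cache)             # B maps to the same lines
    resumed = run(loop_trace(0, 64, 1), cache)    # A: refill after preemption
    return first, resumed


if __name__ == "__main__":
    print("intra-task misses:", intra_task_misses())
    print("inter-task misses (first run, after preemption):", inter_task_misses())
    print("locked-cache misses:", run_locked(loop_trace(0, 80, 3), {0, 1, 2, 3}))
```

In this toy setup the conventional cache's miss count depends on what was executed before (the resumed task pays the refill again), whereas the locked model yields the same count for a given trace regardless of preemptions, which is the property that simplifies the analysis.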
In [6,7,8] hardware and software techniques are used to divide the cache memory, dedicating one or more partitions to each task and thus avoiding the inter-task interference. The main drawback of these proposals is that they do not address the intra-task interference, leaving one side of the problem unresolved. Also, in several cases the inter-task interference is only reduced but not fully eliminated, so the inter-task problem also remains unresolved.

The use of locking caches has been proposed in [9,10] as an alternative to conventional caches that addresses the analysis of both intra-task and inter-task interference. The static use of locking caches fully eliminates the intra-task interference, allowing the use of simple algorithms in order to estimate the