ARCHITECTURAL APPROACHES TO REDUCE LEAKAGE ENERGY IN CACHES

Shashikiran H. Tadas & Chaitali Chakrabarti
Department of Electrical Engineering, Arizona State University, Tempe, AZ 85287
tadas@asu.edu, chaitali@asu.edu

ABSTRACT

In this paper, we present two methods to reduce leakage energy by dynamically resizing the cache during program execution. The first method monitors the miss rate of the individual subbanks (in a subbanked cache structure) and selectively shuts them down if their miss rate falls below a predetermined threshold. Simulations on SPECJVM98 benchmarks show that for a 64K I-cache, this method results in a leakage reduction of 17-69% for a 4-subbank structure and 18-75% for an 8-subbank structure when the performance penalty is <1%. The second method dynamically resizes the cache based on whether the macro-blocks (groups of adjacent cache blocks) are being heavily accessed or not. This method has a higher area overhead but achieves greater leakage energy reduction. Simulations on the same set of benchmarks show that this method results in a leakage reduction of 22-81% for the I-cache when the performance penalty is <0.1%, and 17-85% for the D-cache when the performance penalty is <1%.

1. INTRODUCTION

In recent microprocessor designs, a large part of the die is devoted to memory components such as instruction caches, data caches, buffers and tables. For instance, 30% of the Alpha 21264 and 60% of the StrongARM chip area are devoted to caches and memory components. Thus a significant part of the leakage energy on a chip is due to caches. In fact, according to recent industry estimates, for 0.13um technology, 30% of L1-cache energy and 80% of L2-cache energy is due to leakage [1]. An effective way of reducing leakage energy in caches is to dynamically resize them during application execution.
The novel cache design referred to as the Dynamically ResIzable Instruction cache (DRI-cache) [1] virtually turns off the supply voltage to the cache's unused sections to eliminate leakage. At the architectural level, the method exploits the variability in instruction cache usage and reduces the instruction cache size dynamically. At the circuit level, it uses a mechanism called gated-Vdd, which reduces leakage by effectively turning off the supply voltage to the SRAM cells of the cache's unused block frames.

Another approach to reduce leakage energy is based on the selective-way cache organization [2]. This method exploits the subarray partitioning of set-associative caches to provide the ability to disable a part of the cache. The technique allows the set associativity of the cache to vary with the cache utilization across applications, while the cache size is set prior to application execution. A more recent approach operates at a finer granularity and selectively shuts off cache blocks when they have not been accessed for a predetermined length of time [3]. This method can reduce L1 leakage energy by 4x on SPEC2000 benchmarks. Even greater reductions can be obtained by adapting the decay intervals during program execution.

In this paper, we present two promising approaches to reduce leakage energy in caches. The first approach is based on selectively shutting off the subbanks of an instruction cache, while the second approach utilizes the access pattern of the macro-blocks to resize the cache (instruction or data). While the first approach has a lower area overhead, both approaches achieve leakage energy reduction at the expense of a mild degradation in performance. Simulations on the SPECJVM98 benchmarks show that the active portion of the instruction cache is 32%-80% for a 4-subbank structure and 24%-78% for an 8-subbank structure when the performance penalty is <1%.
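To make the block-level policy of [3] concrete, the following is a minimal, purely illustrative sketch (not taken from any of the cited papers): each cache block carries a counter that is reset on access and advanced on a coarse clock tick; a block whose counter reaches the decay interval is gated off, saving its leakage at the cost of a miss on its next access. All class and parameter names here are hypothetical.

```python
# Illustrative sketch of decay-based cache block shutoff [3].
# A per-block counter is reset on every access and advanced on a
# coarse "tick"; blocks that stay idle for decay_interval ticks are
# gated off (no leakage) and reactivate, with a miss, on next access.

class DecayCache:
    def __init__(self, num_blocks=16, decay_interval=4):
        self.decay_interval = decay_interval
        self.counters = [0] * num_blocks   # ticks since last access
        self.active = [True] * num_blocks  # False = gated off

    def access(self, block):
        """Return True on hit; a gated-off block always misses."""
        hit = self.active[block]
        self.active[block] = True          # (re)activate on access
        self.counters[block] = 0
        return hit

    def tick(self):
        """Advance the coarse decay clock by one step."""
        for b in range(len(self.counters)):
            if self.active[b]:
                self.counters[b] += 1
                if self.counters[b] >= self.decay_interval:
                    self.active[b] = False  # gate off the idle block

cache = DecayCache()
cache.access(3)
for _ in range(4):
    cache.tick()
print(cache.access(3))  # block 3 decayed while idle -> miss (False)
print(cache.access(3))  # immediately re-accessed -> hit (True)
```

Adapting the decay interval at run time, as suggested above, would amount to adjusting `decay_interval` per application phase based on the observed miss/shutoff trade-off.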
This results in a leakage energy reduction of 17-69% (average 38%) for the 4-subbank structure and 20-75% (average 52%) for the 8-subbank structure. Similar simulations on the access-pattern-based method show that the active portion is 16%-67% for the instruction cache when the performance penalty is 0.1% and 28%-78% for the data cache when the performance penalty is 1.0%. This results in a 22-81% leakage energy reduction in the instruction cache and 17-65% in the data cache.

The rest of the paper is organized as follows. Sections 2 and 3 describe and validate the approaches based on selective disabling of the subbanks and on monitoring of the access pattern of the macro-blocks, respectively. Section 4 concludes the paper.

2. DISABLING CACHE SUBBANKS

2.1 Leakage Energy Reduction Mechanism

The main idea behind the proposed leakage energy reduction mechanism is to selectively shut down some of the subbanks in a subbanked cache structure based on the miss rates of the individual subbanks. The scheme is implemented on both a 4-subbank and an 8-subbank instruction cache. The mechanism works as follows. The miss rate of each subbank is computed over intervals of one million accesses. If the miss rate falls below the preset threshold value, the least accessed subbank is disabled. In fact, the entire subbank is not disabled: the top 1KB of the subbank is left active for future accesses. This portion is referred to as the ADS, or activated part of the disabled subbank. Future accesses (either reads or writes) to the disabled subbank are mapped into the ADS. By leaving a fraction of the subbank (1KB) active, we reduce the performance degradation significantly. To map accesses to the ADS, we mask the