CVP: An Energy-Efficient Indirect Branch Prediction with Compiler-Guided Value Pattern

Mingxing Tan, Xianhua Liu, Dong Tong, Xu Cheng
Microprocessor Research and Development Center, Peking University, Beijing, China
{tanmingxing, liuxianhua, tongdong, chengxu}@mprc.pku.edu.cn

ABSTRACT

Indirect branch prediction is becoming increasingly important in modern high-performance processors. However, previous indirect branch predictors either require a significant amount of hardware storage and complexity, or rely heavily on expensive manual profiling.

In this paper, we propose Compiler-Guided Value Pattern (CVP) prediction, an energy-efficient and accurate indirect branch prediction scheme based on compiler-microarchitecture cooperation. The key idea of CVP prediction is to use the compiler-guided value pattern as correlated information to hint the dynamic predictor. The value pattern reflects the regularity of the value correlation, and thus significantly improves the prediction accuracy even with deep pipelines or long memory latency.

CVP prediction relies on the compiler to automatically identify the primary value correlation based on three high-level program substructures: virtual function calls, switch-case statements, and function pointer calls. The compiler-identified information is then fed back to the dynamic predictor and used to hint the indirect branch prediction at runtime. We show that CVP prediction can be implemented in modern processors with little extra hardware support.

Evaluations show that CVP prediction improves the prediction accuracy by 46% over traditional BTB-based prediction, leading to a performance improvement of 20%. Compared with the state-of-the-art aggressive ITTAGE and VBBI predictors, CVP prediction improves performance by 5.5% and 4.2%, respectively.
Categories and Subject Descriptors
C.1.0 [Processor Architectures]: General; C.5.3 [Computer System Implementation]: Microcomputers - Microprocessors

Keywords
Indirect branch prediction, compiler-guided value pattern.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ICS’12, June 25–29, 2012, San Servolo Island, Venice, Italy.
Copyright 2012 ACM 978-1-4503-1316-2/12/06 ...$10.00.

1. INTRODUCTION

In modern high-performance processors, accurate branch prediction improves both performance and energy efficiency. As indirect branches are widely used in modern applications [10, 21], an accurate indirect branch predictor is becoming an important component of modern processors [14].

Unfortunately, indirect branches are hard to predict because each indirect branch may correspond to multiple targets [18, 21]. Function return instructions, a special kind of indirect branch, can be predicted using a return address stack [19], but the other indirect branches, used for virtual function calls, switch-case statements, and function pointer calls, remain hard to predict. As a result, indirect branch mispredictions usually make up a sizable fraction of overall branch mispredictions [18, 21].

We measure the Mispredictions Per Kilo Instructions (MPKI) for conditional branches, function returns, and indirect branches¹ in a 4-issue, 16-stage processor with a 4-way, 4K-entry Branch Target Buffer (BTB). Figure 1 shows the results. On average, indirect branch mispredictions account for 42% of the total MPKI.
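To make the multiple-target problem concrete, the following is a minimal sketch, not the evaluated hardware: it assumes a toy BTB that stores only the last observed target per branch, with unlimited capacity rather than the 4-way, 4K-entry organization above. The names `LastTargetBTB`, `count_mispredictions`, and `mpki`, as well as the traces and addresses, are hypothetical and chosen only for illustration.

```python
def mpki(mispredictions, instructions):
    """Mispredictions per kilo (1000) instructions."""
    return 1000.0 * mispredictions / instructions


class LastTargetBTB:
    """Toy BTB (assumption): keeps only the most recent target per branch PC."""

    def __init__(self):
        self.table = {}  # branch PC -> last observed target

    def predict_and_update(self, pc, actual_target):
        """Return True if the stored target matched, then record the actual target."""
        hit = self.table.get(pc) == actual_target
        self.table[pc] = actual_target
        return hit


def count_mispredictions(trace):
    """Replay a list of (pc, target) pairs through a fresh toy BTB."""
    btb = LastTargetBTB()
    return sum(not btb.predict_and_update(pc, target) for pc, target in trace)


# Hypothetical traces: one indirect branch at PC 0x400, executed 8 times.
monomorphic = [(0x400, 0xA00)] * 8                   # always the same target
alternating = [(0x400, 0xA00), (0x400, 0xB00)] * 4   # two targets, strictly alternating

print(count_mispredictions(monomorphic))  # 1: only the cold miss
print(count_mispredictions(alternating))  # 8: every prediction is stale
print(mpki(8, 2000))                      # 4.0, if the trace spanned 2000 instructions
```

Under these assumptions, a branch with a single target is mispredicted only once, while a branch that alternates between two targets defeats the last-target scheme on every execution. This is the behavior that motivates using extra correlated information for indirect branches.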
Figure 1: MPKI for different kinds of branches. (Y-axis: Mispredictions Per Kilo Instructions (MPKI), 0 to 14; series: indirect branches, function returns, conditional branches; benchmarks: perlbmk, perlbench, gap, gcc00, gcc06, eon, sjeng, crafty, ixx, richards, AVG.)

¹ In the rest of this paper, an “indirect branch” refers to a non-return unconditional indirect branch instruction.

Previous indirect branch predictors generally require a significant amount of extra hardware storage and complexity. History-based predictors such as the TTC predictor [5], the Cascade predictor [7], and the ITTAGE predictor [28] generally require large extra storage, while the value-based ARVI predictor [6] and VBBI predictor [11] rely heavily on complex control logic or manual profiling. These predictors mainly focus on a hardware-only approach, because they aim to be readily deployed into existing architectures and maintain bina-