Early Branch Prediction Circuit for High Performance Digital Signal Processors

Aamir A. Farooqui 1, Vojin G. Oklobdzija 2

1 Department of Electrical and Computer Engineering, University of California, Davis, CA 95616. e-mail: aamirf@ece.ucdavis.edu
2 Integration Berkeley, California. e-mail: vojin@nuc.berkeley.edu. http://www.integr.com/

Abstract

This paper presents the design and VLSI implementation of an Early Branch Prediction (EBP) circuit based on a variation of the Carry Look-ahead scheme. The key features of this design are low area, high speed (a delay of 2 log n/2 + 1), and high modularity. This design outperforms all the EBP designs presented so far. For a 64-bit word length, the early branch prediction is obtained in 679 ps, as simulated for 0.2-μm technology under typical conditions. Simulation and layout results for 0.2-μm CMOS technology show a 30% increase in speed with a 25% decrease in area compared to recently published results.

1. Introduction

Handling of conditional branches is an important issue in high-performance computer design. Conditional branch instructions create a "critical path" in many processors, because evaluating the condition (true or false) takes time in addition to the execution of the branch instruction itself. Many attempts have been made at early branch prediction (EBP) [1,2,5,6,8]. Recently, David et al. [1] proposed a circuit based on the Prefix-And method. They claim it is the fastest possible circuit, with a delay of log n + 3 (where n is the number of bits).

In this paper, we propose a new scheme for EBP which requires a delay of 2 log n/2 + 1, with minimum hardware. In this design, only one circuit is used to evaluate the Greater than (GT), Less than (LT), and Equal to Zero (ETZ) conditions. In all the EBP circuits presented so far, one circuit each is used for the evaluation of the GT and LT conditions, and a separate XOR-AND tree is used for the evaluation of the ETZ condition.
To make a fair comparison between the proposed circuit and a recently published design [1], the two circuits were laid out on a single chip.

2. Architecture

Let $A = (a_{n-1}, \dots, a_0)$ and $B = (b_{n-1}, \dots, b_0)$ be the two operands to be compared, with $a_{n-1}$ and $b_{n-1}$ being the sign bits. The three conditions to be tested are A>B (GT), A<B (LT), and A=Z (ETZ). The conditions A≥B and A<B can be evaluated by subtracting B from A and looking at the carry out of the result and the sign bits of A and B (see Table 1). The detection of the carry out signal requires the delay of a carry-propagate adder.

In this paper we have used a modified Carry Look-ahead scheme. It is based on the fact that the carry generated (G) at bit position i ($G_i = a_i \bullet b_i$, where $\bullet$ is the AND operation) can be dropped in the computation of the group carry if the two input bits at any position j (j > i) are zero. We call this function NZ (no zero), and its value at bit position i is $NZ_i = a_i + b_i$, where + is the OR operation.

$A_{n-1}$   $B_{n-1}$   Condition
0           1           A>B
1           0           A<B
0           0           A≥B if $C_{out}$ = 1, else A<B
1           1           A≥B if $C_{out}$ = 1, else A<B

Table 1. Detection of A≥B and A<B, based on $A_{n-1}$, $B_{n-1}$, and $C_{out}$.

Since the functions $G_i$ and $NZ_i$ (i = n-1 to 0) can be generated simultaneously from A and B, the group carry $C_{out}$ can be computed in 3 logic levels (if there is no limitation on fan in/out):

$$C_{out} = NZ_{n-1} \bullet \cdots \bullet NZ_2 \bullet NZ_1 \bullet G_0 + NZ_{n-1} \bullet \cdots \bullet NZ_2 \bullet G_1 + \cdots + G_{n-1} \qquad (1)$$

Using the NZ and G functions, we can evaluate the ETZ condition as follows, where an overbar denotes the logical complement:

$$\overline{\left(\overline{NZ_{n-1}} + \cdots + \overline{NZ_0}\right)} \bullet \left(\overline{G_{n-1}} \bullet \cdots \bullet \overline{G_0}\right) = (a_{n-1} \oplus b_{n-1}) \bullet \cdots \bullet (a_0 \oplus b_0) \qquad (2)$$
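The two expressions above can be checked in software with a minimal Python sketch. It is a bit-level model for illustration only, not the circuit itself. The function names `g_nz`, `group_carry`, and `xor_product` are hypothetical. The sketch also assumes that the product of XOR terms in equation (2) corresponds to the AND of every $NZ_i$ with the complement of every $G_i$.

```python
def g_nz(a, b, n):
    """Return per-bit lists G_i = a_i AND b_i and NZ_i = a_i OR b_i (index 0 = LSB)."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n)]
    nz = [((a >> i) & 1) | ((b >> i) & 1) for i in range(n)]
    return g, nz


def group_carry(g, nz):
    """Equation (1): OR over i of (G_i AND NZ_j for every j > i)."""
    n = len(g)
    return int(any(g[i] and all(nz[j] for j in range(i + 1, n))
                   for i in range(n)))


def xor_product(g, nz):
    """Equation (2) as read here: every NZ_i is 1 and every G_i is 0."""
    return int(all(nz) and not any(g))


def check(n):
    """Compare both expressions against direct arithmetic for all n-bit inputs."""
    full = (1 << n) - 1
    for a in range(1 << n):
        for b in range(1 << n):
            g, nz = g_nz(a, b, n)
            # Equation (1) should equal the carry out of the n-bit sum a + b.
            if group_carry(g, nz) != ((a + b) >> n) & 1:
                return False
            # Equation (2) should be 1 exactly when every bit pair differs.
            if xor_product(g, nz) != int((a ^ b) == full):
                return False
    return True


if __name__ == "__main__":
    print(check(4))
```

If the second operand is supplied bit-inverted (an assumption about how the subtraction is formed, not something the text states), `xor_product` returns 1 exactly when the two original operands are equal.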