Area and Energy-Efficient Crosstalk Avoidance Codes for On-Chip Buses

Srinivasa R. Sridhara, Arshad Ahmed, and Naresh R. Shanbhag
Coordinated Science Laboratory/ECE Department
University of Illinois at Urbana Champaign
1308 W Main St. Urbana IL 61801
{sridhara,ahmed4,shanbhag}@uiuc.edu

Abstract

Capacitive crosstalk between adjacent wires in long on-chip buses significantly increases propagation delay in the deep submicron regime. A high-speed bus can be designed by eliminating crosstalk delay through bus encoding. In this paper, we present an overview of the existing coding schemes and show that they require either a large wiring overhead or complex encoder-decoder circuits. We propose a family of codes, referred to as overlapping codes, that reduce both overheads. We construct two codes from this family and demonstrate their superiority over existing schemes in terms of area and energy dissipation. Specifically, for a 1-cm 32-bit bus in 0.13-μm CMOS technology, we present a 48-wire solution that has 1.98× speed-up, 10% energy savings and requires 20% less area than shielding.

1 Introduction

On-chip global buses are increasing in length with increasing die sizes, resulting in large propagation delays [1]-[3]. The delays of these buses can limit the system performance in many high-speed microprocessors [2, 3]. This trend is anticipated to worsen in the future due to the increasing gap between gate delay and interconnect delay brought about by shrinking feature sizes. In the deep submicron (DSM) era, the coupling capacitance is significant compared to the bulk capacitance. Hence, the capacitive crosstalk due to the transitions on adjacent wires leads to a significant increase in the worst-case delay [4]-[7]. This increase in the delay is referred to as the crosstalk delay. Coding techniques [5]-[7] have been proposed to avoid crosstalk delay.
Coding is the process of mapping information bits or data words into codewords such that the codewords exhibit certain desired properties. In order to prevent crosstalk delay, any two codewords following one another on the bus should not have transitions that incur the crosstalk delay penalty. This can be achieved by either avoiding specific data patterns [6] or avoiding opposing transitions on adjacent wires [7]. However, for large buses, it is impractical to encode all bits at once due to the prohibitive complexity of the encoder-decoder (codec) circuits. Therefore, partial coding [6, 7] is employed, in which the bus is broken into sub-buses of smaller width which are encoded into sub-channels. These sub-channels are then combined in such a way as to avoid crosstalk delay at their boundaries. This recombination requires additional wires [6, 7] and additional codec delay [6].

In this paper, we present overlapping codes. In this partial coding technique, adjacent sub-channels are overlapped in order to obtain compact buses for a given data rate. While such a scheme reduces the wiring overhead and codec complexity of combining sub-channels, it places additional restrictions on the component partial codes. Depending on the partial codes used, we show that these restrictions can be satisfied either by reducing the code rate or by using memory to track the state of the partial code. We construct two codes using this technique and show that, at a given throughput, the wiring and the computational overheads are reduced compared to existing schemes.

2 Bus Models

In this section, we review the analytical models for delay and energy dissipation in DSM buses. In this paper, we assume an n-bit parallel bus in a single metal layer. Further, we assume that the rise time of the drivers and the loss in the interconnects are such that the inductance can be safely ignored [3].
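Such DSM buses can be modeled as distributed RC networks with coupling capacitance between adjacent wires.

2.1 Delay Model

The delay of line $l$ ($1 < l < n$) of the bus is given by [5]

$$T_l = \tau_0 \left[ (1 + 2\lambda)\Delta_l^2 - \lambda \Delta_l \left( \Delta_{l-1} + \Delta_{l+1} \right) \right], \tag{1}$$

where $\tau_0$ is the delay of a crosstalk-free line, $\lambda$ is the ratio of the coupling capacitance to the bulk capacitance, and $\Delta_l$ is the transition occurring on line $l$. $\Delta_l$ is equal to 1 for a 0-to-1 transition, -1 for a 1-to-0 transition, and 0 for no transition. For example, if line $l$ and both of its neighbors switch in the same direction, (1) gives $T_l = \tau_0$, whereas if both neighbors switch in the direction opposite to line $l$, it gives $T_l = (1 + 4\lambda)\tau_0$.

2.2 Energy Model

The average dissipated energy per bus transfer depends on the statistical distribution of the data and is given by [8]

$$E = \operatorname{tr}(C_T A)\, V_{dd}^2, \tag{2}$$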