Data Bus Inversion in High-Speed Memory Applications

Timothy M. Hollis, Member, IEEE

Abstract—Efforts to reduce high-speed memory interface power have led to the adoption of data bus inversion or bus-invert coding. This study compares two popular algorithms, which seek to limit the number of simultaneously transitioning signals and to bias the state of transmitted data toward a preferred binary level, respectively. A new algorithm, which provides a compromise between transition frequency and preferred signal level, is proposed, and the three algorithms are compared in terms of their impact on power consumption, power supply noise reduction, and general signal integrity enhancement when used in conjunction with a variety of link topologies.

Index Terms—Bus-invert coding, data bus inversion (DBI), single-ended signaling, transmission line termination.

Manuscript received August 14, 2008; revised November 3, 2008. First published March 16, 2009; current version published April 17, 2009. This paper was recommended by Associate Editor M. Anis. The author is with Micron Technology, Inc., Boise, ID 83716 USA (e-mail: thollis@ieee.org). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TCSII.2009.2015395

I. INTRODUCTION

With the growing popularity of mobile consumer electronics comes a new memory interface paradigm, wherein the milliwatt-per-gigabit-per-second metric takes precedence over brute-force bandwidth enhancement. In fact, the emphasis on minimizing power consumption in mobile applications has kept data rates between integrated circuits relatively low, while a simultaneous push for a reduced form factor has led to shorter onboard signal routing and a variety of 3-D (stacked-die) configurations, in which the total channel length may be less than a few millimeters. These extremely short channel lengths, coupled with modest data rates, allow input/output (I/O) designers to shift focus from the channel to other aspects of the interface design.

One popular technique, initially proposed to reduce signaling power in the single-ended interface, is data bus inversion (DBI) or bus-invert coding, in which the state of the data to be transmitted may or may not be inverted prior to transmission in accordance with a predetermined encoding algorithm [1]–[3]. The most familiar DBI algorithms, originally titled “limited-weight coding” and “bus inversion,” have come to be known as DBI-DC and DBI-AC, respectively, as one manipulates signal-level probabilities (dc) while the other targets signal transition frequency (ac) [4], [5].

Although theoretical data have been published on these techniques [1]–[3], comparatively little silicon-based validation has been reported [4], [5]. This brief assesses the impact of DBI for common I/O topologies and provides measured data points from Micron’s GDDR4 2.5-Gb/s graphics dynamic RAM (DRAM) [6]. A new algorithm, incorporating the favorable characteristics of both DBI-DC and DBI-AC, is also presented and analyzed in light of various termination schemes.
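To make the distinction concrete, the following C sketch shows one plausible form of the two decision rules introduced above and detailed in Section II, applied to an 8-bit bus with a dedicated flag lane. The byte-wide bus, the four-bit majority threshold, and all identifiers are illustrative assumptions and are not drawn from the GDDR4 standard.

    /* Minimal sketch of the two classic DBI decision rules for one
       8-bit data group; a ninth lane carries the DBI flag. The bus
       width and names are illustrative, not standard-defined. */
    #include <stdint.h>

    /* Count the set bits in a byte (population count). */
    static int popcount8(uint8_t v)
    {
        int n = 0;
        for (; v != 0; v >>= 1)
            n += v & 1u;
        return n;
    }

    /* DBI-DC: invert whenever more than half of the bits are at the
       non-preferred level (here, logic one), so that no more than
       four of the eight lanes ever drive a one simultaneously. */
    uint8_t dbi_dc_encode(uint8_t data, uint8_t *flag)
    {
        if (popcount8(data) > 4) {
            *flag = 1;              /* receiver must re-invert */
            return (uint8_t)~data;
        }
        *flag = 0;
        return data;
    }

    /* DBI-AC: invert whenever more than half of the lanes would
       toggle relative to the byte previously driven onto the bus,
       limiting the simultaneously switching outputs to four. */
    uint8_t dbi_ac_encode(uint8_t data, uint8_t prev_sent, uint8_t *flag)
    {
        if (popcount8((uint8_t)(data ^ prev_sent)) > 4) {
            *flag = 1;
            return (uint8_t)~data;
        }
        *flag = 0;
        return data;
    }

Note that, in the DBI-AC case, the comparison is made against the byte actually driven onto the bus in the previous cycle (i.e., after any inversion), so the encoder must retain its own encoded output from one bit time to the next.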
The sections that follow lay out the current state of DBI, identify memory-specific challenges that may dictate the choice of DBI realization for a given system, propose a new combinational DBI algorithm (DBI-AC/DC), and compare the relative performance of the three algorithms (DBI-DC, DBI-AC, and DBI-AC/DC) when applied to a variety of interface configurations.

II. PRIOR ART

DBI algorithms evaluate parallel data bits in the transmission queue, and depending on the state or relative state (e.g., the state of a current bit relative to a past bit) of the data to be sent, a decision is made to invert or not invert a portion or all of the bits prior to transmission. A DBI flag is sent in parallel with the data over a dedicated channel to notify the receiver whether the data have been inverted.

The DBI-DC algorithm, mentioned previously, considers the current state of the data to be transmitted and chooses to invert or not invert the parallel data, with the underlying goal of minimizing either the number of “ones” or “zeros” simultaneously transmitted [1], [3]. Alternatively, the DBI-AC algorithm compares the current state of the parallel data with the state of the immediately preceding data and chooses to invert or not invert the current data group, with the underlying goal of minimizing the number of simultaneously transitioning signals across the width of the bus [2].

III. MEMORY-SPECIFIC IMPLEMENTATION

Although the initial motivation for adopting DBI into memory systems was to reduce signaling power, realization of the technique within the GDDR4 standard uncovered additional ways in which interfaces benefit from DBI encoding [4]. Two such benefits are shown in Fig. 1, which presents measured data taken from Micron’s GDDR4 high-speed graphics DRAM operating at 2.5 Gb/s. In both cases, the left image shows the measurement (data eye or supply noise) under nominal operation (no DBI), while the right image shows the same measurement with DBI-DC applied. As demonstrated by the data eye width and peak-to-peak supply noise measurements reported in the figure, application of the DBI-DC algorithm to the graphics memory link can increase timing margins by as