IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 25, NO. 4, APRIL 2017 1271
A 16-Core Voltage-Stacked System With Adaptive
Clocking and an Integrated Switched-Capacitor
DC–DC Converter
Sae Kyu Lee, Student Member, IEEE, Tao Tong, Student Member, IEEE, Xuan Zhang, Member, IEEE,
David Brooks, Fellow, IEEE, and Gu-Yeon Wei, Member, IEEE
Abstract— This paper presents a 16-core voltage-stacked
system with adaptive frequency clocking (AFClk) and a fully
integrated voltage regulator that demonstrates efficient on-chip
power delivery for multicore systems. Voltage stacking alleviates
power delivery inefficiencies due to off-chip parasitics but
adds complexity to combat internal voltage noise. To address
this noise, the system
utilizes an AFClk scheme with an efficient switched-capacitor
dc–dc converter to mitigate noise on the stack layers and
to improve system performance and efficiency. Experimental
results demonstrate robust voltage noise mitigation as well as the
potential of voltage stacking as a highly efficient power delivery
scheme. This paper also illustrates that augmenting the hardware
techniques with intelligent workload allocation that exploits the
inherent properties of voltage stacking can preemptively reduce
the interlayer activity mismatch and improve system efficiency.
Index Terms—Adaptive frequency clocking (AFClk), dc–dc
converter, multicore, power delivery, voltage noise, voltage
stacking.
Manuscript received June 1, 2016; revised October 2, 2016; accepted
November 3, 2016. Date of publication December 19, 2016; date of current
version March 20, 2017. This work was supported in part by NSF under
Grant CCF-0903437 and Grant CCF-1218298, and in part by DARPA
under Grant HR0011-13-C-0022. This paper was recommended by Associate
Editor I. Savidis.
S. K. Lee, T. Tong, D. Brooks, and G.-Y. Wei are with the School of
Engineering and Applied Sciences, Harvard University, Cambridge,
MA 02138 USA (e-mail: saekyu@eecs.harvard.edu; taotong@seas.harvard.edu;
dbrooks@eecs.harvard.edu; guyeon@eecs.harvard.edu).
X. Zhang was with the School of Engineering and Applied Sciences,
Harvard University, Cambridge, MA 02138 USA. He is now with
Washington University in St. Louis, St. Louis, MO 63130 USA (e-mail:
xuan.zhang@wustl.edu).
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TVLSI.2016.2633805
I. INTRODUCTION
EFFICIENT power delivery is a critical design target
for modern computing systems from high-performance
servers to mobile devices. Continued decreases in supply
voltages and aggressive power reduction techniques (e.g.,
clock and power gating) under a fixed power budget have led
to increases in average current draw and worsening current
transients, forcing stringent requirements on the power delivery
impedance. Today’s 100-W high-performance processors
operate under 1 V, draw in excess of 100 A, and require <1-mΩ
impedance for a 10% voltage noise margin, which is extremely
challenging to achieve. Furthermore, significant I²R power
loss in the off-chip parasitic resistance of the power delivery
network can greatly degrade the overall system efficiency,
while electromigration due to high current levels is a concern.
Exacerbating the issue, the off-chip components ostensibly
have not scaled, contrary to the ever-decreasing power delivery
impedance requirements needed to keep up with the high
current demands of modern computing systems.
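The sub-milliohm impedance target quoted above follows from Ohm's law. The sketch below is only a rough check: it assumes the power delivery network can be treated as a single lumped impedance carrying the full load current, which ignores frequency dependence and transient behavior. The function name `max_pdn_impedance` and its parameters are illustrative and do not come from the paper.

```python
def max_pdn_impedance(supply_v, noise_margin_frac, load_current_a):
    """Largest lumped impedance (ohms) that keeps the voltage drop within the margin.

    Assumption: the whole load current flows through one lumped impedance.
    """
    allowed_drop_v = supply_v * noise_margin_frac  # e.g., 10% of 1 V = 0.1 V
    return allowed_drop_v / load_current_a


# Figures from the text: 1-V supply, 10% noise margin, 100-A current draw.
z_ohm = max_pdn_impedance(supply_v=1.0, noise_margin_frac=0.10, load_current_a=100.0)
print(f"Impedance target: {z_ohm * 1e3:.2f} mOhm")
```

Under these assumptions the budget is 0.1 V / 100 A = 1 mΩ, and any current above 100 A pushes the requirement below that value.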
Voltage stacking is an on-chip power delivery solution that
delivers a high voltage to the chip by vertically stacking
voltage domains in series and recycling charge through the
stacked layers, thereby reducing the overall chip current
demands [1]–[11]. For the same chip power, an n-way stacked
system reduces the current draw of the chip by a factor of n,
which reduces IR drop by a factor of n and I²R power loss
by a factor of n², and alleviates the off-chip impedance requirements
for the same voltage noise margin. It also obviates the need for an
off-chip dc–dc converter with a high step-down ratio, which improves
off-chip regulator efficiency.
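The scaling argument above can be checked numerically. The following is a minimal sketch, assuming the chip power and per-layer voltage are held constant and the off-chip network is a single parasitic resistance. The function `stacked_delivery` and the 1-mΩ resistance value are hypothetical and chosen only for illustration.

```python
def stacked_delivery(chip_power_w, layer_v, r_offchip_ohm, n):
    """Off-chip current, IR drop, and I^2*R loss for an n-way stacked chip.

    Assumptions: fixed chip power, n layers in series each at layer_v,
    and a single lumped off-chip parasitic resistance.
    """
    input_v = n * layer_v                  # the stack is fed at n times the layer voltage
    current_a = chip_power_w / input_v     # same power at higher voltage means less current
    ir_drop_v = current_a * r_offchip_ohm
    i2r_loss_w = current_a ** 2 * r_offchip_ohm
    return current_a, ir_drop_v, i2r_loss_w


# Illustrative values: 100-W chip, 1-V layers, 1-mOhm parasitic resistance (assumed).
flat = stacked_delivery(100.0, 1.0, 1e-3, n=1)
stacked = stacked_delivery(100.0, 1.0, 1e-3, n=4)

for name, a, b in zip(("current", "IR drop", "I^2R loss"), flat, stacked):
    print(f"{name}: reduced by a factor of {a / b:.0f}")
```

With n = 4, the current and IR drop fall by a factor of 4 and the I²R loss by a factor of 16, consistent with the n and n² dependence stated above.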
There has been growing interest in integrating dc–dc converters
on-chip to perform voltage conversion from the high
input voltage levels provided to the chip [12]–[15]. However,
such integration usually suffers from inferior conversion efficiency
due in part to the poor-quality inductors and capacitors
available on-chip. By stacking voltage domains and obviating
the explicit voltage conversion stage, voltage stacking achieves
high-efficiency power delivery. Ideally, if the power consumption
of all stacked layers perfectly matches, the layer voltages
evenly subdivide the high input voltage, and voltage stacking
achieves optimal intrinsic step-down voltage conversion with
no loss. In practice, interlayer switching activity mismatch
exists and results in interlayer voltage noise due to the series-connected
nature of voltage stacking. This susceptibility
to voltage noise negatively impacts system performance
and energy efficiency, and poses system reliability concerns.
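To illustrate how activity mismatch disturbs the layer voltages, the sketch below models each stacked layer as an equivalent resistance in a series string. This is a deliberate simplification introduced here, not the model used in the paper: real cores are not resistive loads, and the sketch omits any regulator. The function `layer_voltages` and the resistance values are hypothetical.

```python
def layer_voltages(input_v, layer_resistances_ohm):
    """Voltage across each series-connected layer (simple resistive divider).

    Assumption: each layer behaves as a fixed equivalent resistance, so a
    more active layer (drawing more current at a given voltage) has a lower value.
    """
    total_ohm = sum(layer_resistances_ohm)
    return [input_v * r / total_ohm for r in layer_resistances_ohm]


# Four-way stack fed at 4 V (illustrative values).
matched = layer_voltages(4.0, [1.0, 1.0, 1.0, 1.0])
mismatched = layer_voltages(4.0, [1.0, 1.0, 1.0, 0.8])  # last layer is more active

print("matched:   ", [f"{v:.3f}" for v in matched])
print("mismatched:", [f"{v:.3f}" for v in mismatched])
```

In the matched case every layer receives an equal share of the input voltage. In the mismatched case the more active layer sees a lower voltage while the remaining layers rise above nominal, which is the interlayer voltage noise described above.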
To address the corresponding voltage noise issue associated
with voltage stacking, this paper presents a 16-core four-way
voltage-stacked test chip implemented in TSMC’s 40 G
process that integrates industry-grade microprocessor cores
with a multioutput integrated voltage regulator (IVR) and
implements adaptive frequency clocking (AFClk) to mitigate
the impact of voltage noise. The IVR provides minimum
voltage guarantees while AFClk allows the cores to operate
efficiently with minimal margin. With voltage stacking,
1063-8210 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.