This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
IEEE JOURNAL OF SOLID-STATE CIRCUITS
A 1.02-pJ/b 20.83-Gb/s/Wire USR Transceiver
Using CNRZ-5 in 16-nm FinFET
Armin Tajalli, Senior Member, IEEE, Mani Bastani Parizi, Member, IEEE, Dario Albino Carnelli, Chen Cao,
Kiarash Gharibdoust, Davide Gorret, Amit Gupta, Christopher Hall, Ahmed Hassanin, Klaas L. Hofstra,
Brian Holden, Ali Hormati, John Keay, Yohann Mogentale, Victor Perrin, John Phillips, Sumathi Raparthy,
Amin Shokrollahi, Fellow, IEEE, David Stauffer, Richard Simpson, Andrew Stewart, Giuseppe Surace,
Omid Talebi Amiri, Emanuele Truffa, Anton Tschank, Roger Ulrich,
Christoph Walter, and Anant Singh
Abstract— An energy-efficient (1.02 pJ/b) and high-speed
(20.83 Gb/s/wire, 417 Gb/s/mm) link for ultra-short reach (USR)
applications (up to 6-dB channel loss at the Nyquist frequency
of 12.5 GHz) is presented. Correlated non-return to zero (CNRZ)
signaling with low sensitivity to inter-symbol interference (ISI)
has been developed to improve the link budget. In addition to high
pin efficiency (5b6w: 5 bits over 6 wires), the proposed signaling
method provides very good resistance against common-mode and
crosstalk noise sources, allowing for dense routing. A very wideband
(1.3-GHz) jitter tracking mechanism has been employed to
reduce the sensitivity of the system to random and deterministic
jitter and to relax design constraints on the transmitter. A slicer with
low kick-back noise and a circuit topology well matched to the
continuous-time linear equalizer (CTLE) has been designed to
provide both high input sensitivity and tolerance to process,
supply voltage, and temperature (PVT) variations. The link operates
with more than 22-ps (42.5% UI) eye opening at BER =
1E-15. Calibration loops run in the background for quadrature
mismatch error correction, clock and data alignment (CDA),
and offset removal.
Index Terms— Clock forwarding, correlated non-return to
zero (CNRZ), CNRZ-5, correlated NRZ, energy efficiency, inter-
symbol interference (ISI) ratio, ISI sensitivity, ISI, multi-chip
module (MCM), multi-wire signaling, NRZ, orthogonal multi-
wire signaling, pin efficiency, SerDes, transceiver, ultra-short
reach (USR), wideband PLL, wireline.
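The headline figures in the abstract can be cross-checked with a short calculation. The sketch below assumes the usual relation that the Nyquist frequency is half the symbol rate; the function names and the 6-wire bundle power are illustrative derivations, not values reported by the authors.

```python
# Back-of-the-envelope check of the abstract's headline numbers.
# Assumption (not stated in the text): symbol rate = 2 x Nyquist frequency.

NYQUIST_HZ = 12.5e9          # Nyquist frequency quoted in the abstract
BITS_PER_SYMBOL = 5          # 5b6w: 5 bits ...
WIRES_PER_BUNDLE = 6         # ... carried over 6 wires
ENERGY_PER_BIT_J = 1.02e-12  # 1.02 pJ/b


def symbol_rate(nyquist_hz: float) -> float:
    """Symbol (baud) rate implied by a given Nyquist frequency."""
    return 2.0 * nyquist_hz


def per_wire_rate(baud: float, bits: int, wires: int) -> float:
    """Data rate per wire when `bits` bits share `wires` wires each symbol."""
    return baud * bits / wires


def bundle_power(energy_per_bit_j: float, baud: float, bits: int) -> float:
    """Power implied by the energy-per-bit figure for one wire bundle."""
    return energy_per_bit_j * baud * bits


baud = symbol_rate(NYQUIST_HZ)
rate = per_wire_rate(baud, BITS_PER_SYMBOL, WIRES_PER_BUNDLE)
power = bundle_power(ENERGY_PER_BIT_J, baud, BITS_PER_SYMBOL)

print(f"symbol rate    : {baud / 1e9:.2f} GBd")
print(f"per-wire rate  : {rate / 1e9:.2f} Gb/s/wire")
print(f"pin efficiency : {BITS_PER_SYMBOL / WIRES_PER_BUNDLE:.3f} b/wire/symbol")
print(f"bundle power   : {power * 1e3:.1f} mW (derived, illustrative)")
```

Under this assumption, the per-wire rate works out to the 20.83 Gb/s/wire quoted in the title, which suggests the 5b6w pin efficiency and the 12.5-GHz Nyquist frequency are mutually consistent.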
I. INTRODUCTION
HIGH-SPEED and low-power data movement is one of the most
crucial problems in high-performance computing (HPC) systems.
The performance of many advanced applications, such as machine
learning (ML), artificial intelligence (AI), and autonomous
vehicles, depends on the efficiency and speed of communication
among different units in a heterogeneous computing system [1].
Manuscript received August 22, 2019; revised October 31, 2019 and
December 16, 2019; accepted December 16, 2019. This article was approved
by Associate Editor Brian Ginsburg. This work was supported by Kandou Bus.
(Corresponding author: Armin Tajalli.)
Armin Tajalli was with Kandou Bus, 1015 Lausanne, Switzerland. He is now
with the Electrical and Computer Engineering Department, The University of
Utah, Salt Lake City, UT 84112 USA (e-mail: armin.tajalli@utah.edu).
Mani Bastani Parizi, Dario Albino Carnelli, Chen Cao, Davide Gorret, Amit
Gupta, Christopher Hall, Ahmed Hassanin, Klaas L. Hofstra, Brian Holden,
Ali Hormati, John Keay, Yohann Mogentale, Victor Perrin, John Phillips,
Sumathi Raparthy, Amin Shokrollahi, David Stauffer, Richard Simpson,
Andrew Stewart, Giuseppe Surace, Omid Talebi Amiri, Emanuele Truffa,
Anton Tschank, Roger Ulrich, Christoph Walter, and Anant Singh are with
Kandou Bus, 1015 Lausanne, Switzerland, and also with Kandou Bus,
Northampton, U.K.
Kiarash Gharibdoust is with EM Microelectronics, Marin, Switzerland.
Color versions of one or more of the figures in this article are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSSC.2019.2962655
Fig. 1. Advanced MCM structure, in which a USR link is used to move data.
Recently, multi-chip module (MCM) architecture has been
exploited to simultaneously improve yield and reduce the
overall cost. As depicted in Fig. 1, MCM technology enables
the integration of multiple dies fabricated in various process
technologies with different functionalities on a common substrate.
In addition to energy efficiency and performance, yield and its
associated cost implications are a growing concern for large
chips, a concern that MCM technology can mitigate.
High-speed data movement inside advanced modular
multi-die package and system-in-package (SiP) applications is
a key enabling technology to keep pace with Moore’s law [2]
and substantially improve speed and energy efficiency of the
next-generation HPC systems.
Currently, there is significant research focused on increas-
ing the communication capacity for extremely short reach
(XSR) and ultra-short reach (USR) applications. To improve
the data transfer bandwidth (BW), TSMC has introduced
chip-on-wafer-on-substrate (CoWoS) technology, transferring
8 Gb/s/wire with 0.56-pJ/b energy consumption over 500-μm
channels [3]. In addition to low power dissipation, this link
achieves a very high BW density (1.6 Tb/s/mm) using 40-μm
bump pitch. As another example, a multi-chip architecture has
0018-9200 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.