ISSCC 2003 / SESSION 4 / CLOCK RECOVERY AND BACKPLANE TRANSCEIVERS / PAPER 4.3
4.3 A Second-Order Semi-Digital Clock Recovery
Circuit Based on Injection Locking
M.-J. Edward Lee¹, William J. Dally¹,², John Poulton¹, Trey Greer¹,
John Edmondson¹, Ramin Farjad-Rad¹, Hiok-Tiaq Ng¹, Rohit Rathi¹,
Ramesh Senthinathan¹
¹Velio Communications, Milpitas, CA
²Stanford University, Stanford, CA
Clock recovery circuits are among the most critical components
in communication systems. A dual-loop architecture, in which
the frequency synthesizer and the clock aligner are separated,
has been used extensively due to the conflicting needs to sup-
press jitter accumulation and filter noisy input [1]. Among dif-
ferent frequency synthesis architectures, a multiplying delay-
locked loop (MDLL) is advantageous when a clean reference
clock is available since the oscillator noise is accumulated only
over one reference clock cycle before being reset by the clean
source [2]. However, this instantaneous correction produces
large cycle-to-cycle jitter and duty-cycle distortion (jointly called
clock distortion hereafter) for the downstream clock and data
paths. The clock aligner, in contrast, requires a low bandwidth for
input jitter filtering. Its control loop is often implemented as a
first-order system using digital circuits that are flexible, easy to
implement, and robust against noise. A first-order system, however,
cannot filter input jitter and track a frequency offset equally well
at the same time. Furthermore, to provide an infinite phase range,
the timing vernier is often implemented using multi-phase
interpolation, which is expensive in area and power [1]. This paper
describes a clock recovery circuit that overcomes these limitations.
Figure 4.3.1 shows the top-level architecture consisting of a fre-
quency synthesizer, a jitter-filtering timing vernier and a second-
order digital phase controller. To reduce clock distortion, the out-
put of the MDLL is injected into a slave replica oscillator, acting
as a first-order low-pass filter on the phase error. This is shown
on the right of Fig. 4.3.1, where I_c is the current bias for the delay
element and I_m is the maximum current bias for the injection
devices. The injection coefficient is defined as I_m/(I_c + I_m) and is
roughly the amount of corrected phase per injection divided by
the phase error between the master and slave oscillators. I_0 is
described in the next paragraph and is assumed to be I_m for ease
of understanding. Since injection occurs at the multiplied
frequency, the error correction bandwidth is made high to suppress
jitter accumulation. Yet, any high-frequency jitter, such as refer-
ence clock injection, is attenuated and spread out over multiple
clock cycles. For example, if the clock frequency is 1GHz and the
injection strength is 1/10, the error correction bandwidth is
about 20MHz and the clock distortion is attenuated by 90%. The
injection strength must be strong enough to cover the lock range
of the slave oscillator over the statistical variation of devices.
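As a rough check on these numbers, the slave oscillator can be modeled as a first-order discrete-time filter on the phase error, updated once per clock cycle. The sketch below is illustrative only: the update rule, the function names, and the mapping of the discrete pole to an equivalent frequency are assumptions made here, not a description of the actual circuit.

```python
import math


def injection_pole_hz(f_clk_hz, k):
    """Equivalent pole frequency of phi[n+1] = (1 - k)*phi[n] + k*theta[n].

    ASSUMPTION: one injection per clock cycle, with k the injection
    coefficient I_m / (I_c + I_m). The discrete pole at z = 1 - k is
    mapped to a continuous-time pole via s = ln(z) * f_clk.
    """
    return -math.log(1.0 - k) * f_clk_hz / (2.0 * math.pi)


def slave_step_response(k, n_cycles):
    """Slave phase, cycle by cycle, after a unit phase step on the master."""
    phi = 0.0
    response = []
    for _ in range(n_cycles):
        phi += k * (1.0 - phi)  # correct a fraction k of the remaining error
        response.append(phi)
    return response


pole = injection_pole_hz(1e9, 0.1)
first_cycle = slave_step_response(0.1, 50)[0]
print(f"pole ~ {pole / 1e6:.1f} MHz, first-cycle step fraction = {first_cycle:.2f}")
```

With k = 1/10 and a 1GHz clock, this simplified model places the pole in the mid-teens of MHz, the same order as the roughly 20MHz quoted above, and passes only one tenth of an abrupt master phase step in the first cycle, which corresponds to the 90% attenuation of clock distortion.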
Injection locking is also used in the timing vernier to vary the
phase of the slave oscillator with respect to the master oscillator.
In Fig. 4.3.1, I_0 can be decreased (increased) to advance (delay)
the slave oscillator with respect to the master oscillator. When
the strength of B_1 (B_0) reaches I_m, B_0 (B_1) can be inverted to
further advance (delay) the slave oscillator. A full 360° phase
adjustment range is achieved. The master and slave oscillators are
identical with slightly different connections to ensure frequency
matching. For example, the master oscillator also contains injec-
tion devices for reference clock injection and both the master and
slave oscillators have identical buffers at the outputs for equal
loading. For clarity, these devices are not shown in Fig. 4.3.1.
A phase control unit accepts binary early and late indications
from the phase detector, performs some filtering, and generates
the appropriate current bias for the timing vernier. Previous
implementations of the phase control unit are first-order in that
they simply count the number of early (late) indications and delay
(advance) the clock phase when a threshold is reached. This
implementation trades off frequency tolerance against input jitter filtering.
With a counter size of N, the phase lag between the optimal
sample point (where early crosses late) and the clock, assuming
uniformly distributed jitter, is (Δf × J × P × N)/(2d), where Δf is the
frequency offset, P is the number of phase steps per unit interval
(UI), d is the edge density, and J is the p-p input jitter. On the
other hand, the counter size also affects the amount of phase
wander due to insufficient input filtering. The phase wander
probability with a uniformly distributed input jitter of 0.5UI
using various phase counter sizes is shown in Fig. 4.3.2, calcu-
lated using a Markov chain. A larger counter size leads to a
smaller phase wander but a larger phase lag. To overcome this
limitation, a frequency control loop is introduced that advances
or delays the clock continuously, as shown in Fig. 4.3.3, where
the frequency of bclk is half of the bit rate. The up and down sig-
nals of the frequency control loop are added to those from the
phase control loop. A frequency generator produces three pulsed
signals whose frequencies divided by P are 122ppm, 61ppm and
30.5ppm of the bit rate. A saturating counter selects these sig-
nals to produce the desired frequency. This circuit allows
240ppm frequency offset but the largest frequency offset the
phase counter sees is 30.5ppm. This enables the utilization of a
larger phase counter to reduce the phase wander without com-
promising the frequency tolerance. The frequency pre-counter
size and phase counter size are programmable for different
applications.
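The phase-lag expression and the effect of the frequency loop can be illustrated numerically. The sketch below is a hedged example, not the implemented logic: the counter size N = 16 and edge density d = 0.5 are assumed values, P = 64 steps per UI is inferred from the 128 vernier steps per two bit times reported later, and the frequency loop is assumed to combine the three pulsed signals as a binary-weighted sum in 30.5ppm increments. All function and constant names are hypothetical.

```python
import math

FREQ_STEP_PPM = 30.5  # smallest pulsed signal, from the text
MAX_FREQ_STEPS = 7    # ASSUMPTION: 122 + 61 + 30.5 ppm combined as a binary-weighted sum


def phase_lag_ui(df, jitter_pp_ui, steps_per_ui, counter_size, edge_density):
    """Phase lag in UI: (df * J * P * N) / (2 * d), with df as a fraction."""
    return df * jitter_pp_ui * steps_per_ui * counter_size / (2.0 * edge_density)


def residual_offset_ppm(offset_ppm):
    """Offset left for the phase counter once the frequency loop has settled.

    ASSUMPTION: the saturating counter settles on the largest multiple of
    30.5 ppm that does not exceed the offset magnitude.
    """
    steps = min(int(abs(offset_ppm) / FREQ_STEP_PPM), MAX_FREQ_STEPS)
    return math.copysign(abs(offset_ppm) - steps * FREQ_STEP_PPM, offset_ppm)


# Assumed operating point: J = 0.5 UI p-p, P = 64, N = 16, d = 0.5.
J, P, N, D = 0.5, 64, 16, 0.5
lag_first_order = phase_lag_ui(200e-6, J, P, N, D)
lag_second_order = phase_lag_ui(residual_offset_ppm(200.0) * 1e-6, J, P, N, D)
print(f"first-order lag:  {lag_first_order:.4f} UI")
print(f"second-order lag: {lag_second_order:.4f} UI")
```

Under these assumptions the first-order loop lags by about 0.1UI at a 200ppm offset, while the phase counter in the second-order loop sees only a 17ppm residual and its lag shrinks in proportion. The residual stays at or below 30.5ppm for offsets up to roughly 240ppm in this model.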
This circuit, implemented in a 0.18μm CMOS technology with a
1.8V supply, is used in several high-bandwidth communication
devices containing as many as 140 3.125Gb/s serial I/Os with
per-lane clock and data recovery (CDR). The CDR and 1:8 deserializer
consume 80mW at 3.125Gb/s in the worst case and occupy
an area of 1mm by 160μm. Figure 4.3.4 shows the jitter tolerance
of the CDR with and without the second-order frequency loop at
2.5Gb/s. The transceiver is running at a 200ppm frequency offset
with a 23b pseudo-random bit sequence. The improvement at low
and high frequencies with the second-order loop is 0.1-0.2UI,
consistent with the phase lag at 200ppm. The fluctuation
between 500kHz and 3MHz is due to the peaking behavior of the
second-order loop. Figure 4.3.5 shows the timing vernier step
sizes over four different lanes. The measurement is taken over a
full 800ps clock cycle (two bit times) that contains 128 steps. The
maximum step size is 22ps and the minimum is -5ps. The large
peaks and the negative steps on the plot are due to the binary
encoding of the timing vernier current bias. It has subsequently
been changed to thermometer encoding to reduce this problem.
Figure 4.3.6 shows the jitter transfer from the reference clock to
the output for an MDLL with and without the slave oscillator.
The reference clock frequency is 125MHz and the multiplication
factor is 8. The addition of the slave oscillator creates a 20MHz
pole in the jitter transfer, indicating that the injection strength
is about 1/10. This implies that the clock distortion is reduced by
90%.
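The negative vernier steps attributed above to binary encoding can be reproduced with a generic current-source mismatch model. The sketch below is not the actual vernier: the 6-bit width, the 5% mismatch values, and all names are assumptions chosen for illustration. It shows why a binary-weighted bias can step backwards at a major code transition, while a thermometer-coded bias built from unit sources is monotonic by construction. For scale, 128 steps over the 800ps cycle give a nominal step of 6.25ps.

```python
def binary_levels(bits, weight_errors):
    """Output level per code for binary-weighted sources.

    weight_errors[k] is the fractional error of the source whose
    nominal weight is 2**k (ASSUMED mismatch model).
    """
    sources = [(2 ** k) * (1.0 + weight_errors[k]) for k in range(bits)]
    return [
        sum(src for k, src in enumerate(sources) if (code >> k) & 1)
        for code in range(2 ** bits)
    ]


def thermometer_levels(unit_sources):
    """Output level per code when each code step switches on one more unit source."""
    levels = [0.0]
    for unit in unit_sources:
        levels.append(levels[-1] + unit)
    return levels


def step_sizes(levels):
    """Difference between consecutive output levels, in units of the nominal step."""
    return [b - a for a, b in zip(levels, levels[1:])]


BITS = 6  # ASSUMED width, for illustration only
# Only the most significant source is 5% low; all others are ideal.
binary_steps = step_sizes(binary_levels(BITS, [0.0] * (BITS - 1) + [-0.05]))
# Unit sources alternate between 5% low and 5% high.
units = [1.0 + (0.05 if i % 2 else -0.05) for i in range(2 ** BITS - 1)]
thermo_steps = step_sizes(thermometer_levels(units))

print("smallest binary step:     ", round(min(binary_steps), 3))
print("smallest thermometer step:", round(min(thermo_steps), 3))
```

In this model the binary-weighted bias steps backwards when the most significant source replaces all the lower ones, because that single source is smaller than their sum. Every thermometer step adds a positive unit current, so mismatch changes the step size but never its sign.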
References
[1] K.-Y. K. Chang et al., “A 2Gb/s/pin Asymmetric Serial Link,” Proc.
IEEE Symposium on VLSI Circuits, pp. 216-217, June 1998.
[2] R. Farjad-Rad et al., “A 0.2-2GHz 12mW Multiplying DLL for Low-Jitter
Clock Synthesis in Highly-Integrated Data-Communication Chips,”
Digest of Technical Papers, IEEE ISSCC, February 2002.
• 2003 IEEE International Solid-State Circuits Conference 0-7803-7707-9/03/$17.00 ©2003 IEEE