This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS
Low-Power Correlation for IEEE 802.16 OFDM
Synchronization on FPGA
Thinh H. Pham, Suhaib A. Fahmy, and Ian Vince McLoughlin
Abstract— This brief compares the use of multiplierless and DSP
slice-based cross-correlation for IEEE 802.16d orthogonal frequency
division multiplexing (OFDM) timing synchronization on Xilinx Virtex-
6 and Spartan-6 field programmable gate arrays (FPGAs). The natural
approach, given the availability of embedded DSP blocks on these FPGAs,
would be to implement standard multiplier-based cross-correlation.
However, this can consume a significant number of DSP blocks, which
may not fit on low-power devices. Hence, we compare a DSP48E1 slice-
based design to four different quantizations of multiplierless correlation
in terms of resource utilization and power consumption. OFDM timing
synchronization accuracy is evaluated for each system at different signal-
to-noise ratios. Results show that even relatively coarse multiplierless
coefficient quantization can yield accurate timing synchronization, and
does so at high clock speeds. Multiplierless designs consume less power
than the DSP48E1 slice-based design, and can be used where
DSP slice resources are insufficient, such as on low-power FPGA devices.
Index Terms—Correlation, cognitive radio, field-programmable gate
arrays (FPGA), IEEE 802.16 standards, orthogonal frequency division
multiplexing (OFDM).
I. INTRODUCTION
Orthogonal frequency division multiplexing (OFDM) is an effective
modulation technique used in both wired and wireless communication
systems. In particular, thanks to its spectral efficiency and
robustness to multipath fading, OFDM has been specified for several
high bit-rate wireless transmission systems, such as wireless local
area networks in IEEE 802.11 and metropolitan area networks in
IEEE 802.16d. However, OFDM
performance is sensitive to receiver synchronization. Frequency offset
causes inter-subcarrier interference, and errors in timing synchroniza-
tion can lead to inter-symbol interference [1]. Therefore, synchroniza-
tion is critical for good performance in OFDM systems.
Much research has focused on improving OFDM synchronization
performance and accuracy. Cyclic prefix (CP)-based methods were
introduced [2]–[4] to determine frequency offset and symbol timing,
but do not themselves find the start of a frame. To assist with this, all
OFDM frames begin with preamble symbols, which can also be used
to estimate the frequency offset [5]. This relies on the characteristic
of a preamble symbol with two identical halves, using autocorrelation
of the received signal, which can be computed iteratively at low cost
and is robust to frequency offset. However, the metric used results
in a plateau which leads to some uncertainty in determining the
start of a frame. Work in [6]–[9] introduced modified timing metrics
based on autocorrelation and the characteristic of specific preamble
symbols to reduce the ambiguity of the plateau in finding the start of
frame. However, the resulting autocorrelation operation is sensitive
to additive white Gaussian noise (AWGN) and frequency selectivity.
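As a minimal sketch of why this kind of autocorrelation is cheap to compute iteratively, the Python below slides a window over the received samples and updates a running sum with one product added and one removed per step. The function names, the unit-modulus placeholder preamble, and the noise amplitude are assumptions made for illustration; this is not necessarily the exact timing metric of [5].

```python
import cmath
import random


def sliding_autocorr(r, half_len):
    """Return P(d) = sum_{m=0}^{L-1} conj(r[d+m]) * r[d+m+L] for every valid d."""
    L = half_len
    n_out = len(r) - 2 * L + 1
    if n_out <= 0:
        return []
    # Direct evaluation for the first window only.
    p = sum(r[m].conjugate() * r[m + L] for m in range(L))
    out = [p]
    for d in range(n_out - 1):
        # Iterative update: add the product entering the window,
        # subtract the product leaving it.
        p += r[d + L].conjugate() * r[d + 2 * L]
        p -= r[d].conjugate() * r[d + L]
        out.append(p)
    return out


def make_test_signal(half_len=64, offset=50, noise_amp=0.1, seed=1):
    """Placeholder frame: weak noise, two identical unit-modulus halves, weak noise."""
    rng = random.Random(seed)

    def unit_phasor():
        return cmath.exp(2j * cmath.pi * rng.random())

    def noise(n):
        # Noise magnitude stays below noise_amp (assumed, illustrative level).
        return [noise_amp * rng.random() * unit_phasor() for _ in range(n)]

    half = [unit_phasor() for _ in range(half_len)]
    return noise(offset) + half + half + noise(offset)


if __name__ == "__main__":
    r = make_test_signal()
    mags = [abs(p) for p in sliding_autocorr(r, 64)]
    print("peak index:", mags.index(max(mags)))
```

After the first window, each new output costs one complex multiplication for the entering product, plus one addition and one subtraction, regardless of the window length. In hardware, the leaving product would typically be read back from a delay line rather than recomputed.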
Manuscript received February 2, 2012; revised June 6, 2012; accepted
July 21, 2012.
T. H. Pham is with Nanyang Technological University, 639798 Singapore,
and also with the TUM-CREATE Centre for Electromobility, 138649
Singapore (e-mail: hung3@e.ntu.edu.sg).
S. A. Fahmy and I. V. McLoughlin are with Nanyang Technological
University, 639798 Singapore (e-mail: sfahmy@ntu.edu.sg;
mcloughlin@ntu.edu.sg).
Digital Object Identifier 10.1109/TVLSI.2012.2210917
Kishore and Reddy [10] presented an algorithm that requires
knowledge of the time domain preamble in the receiver to compute
a cross-correlation metric between the known and received preamble
symbols. This can accurately determine the start of frame even at
a low signal-to-noise ratio (SNR). However, the cross-correlation
operation requires complex computation. Kim and Park [11] proposed
an accurate synchronization method based upon the preamble symbol
specified in IEEE 802.16d using two separate computation processes:
first, autocorrelation is computed for coarse symbol time offset (STO)
and fractional carrier frequency offset (CFO) estimation to obtain
more reliable frequency synchronization and to reduce hardware
cost; second, the fine STO and the integer CFO are estimated
by performing cross-correlation between the received samples and
known preamble.
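The cross-correlation step can be illustrated with a minimal sketch that correlates the received samples against a known preamble and takes the largest squared magnitude as the start of frame. The names `cross_correlate`, `detect_start`, and `make_received`, the random-phase placeholder preamble, and the noise level are illustrative assumptions; the actual IEEE 802.16d preamble sequence and the estimators of [10] and [11] are not reproduced here.

```python
import cmath
import random


def cross_correlate(received, preamble):
    """C(d) = sum_k received[d+k] * conj(preamble[k]) for every valid offset d."""
    n = len(preamble)
    return [
        sum(received[d + k] * preamble[k].conjugate() for k in range(n))
        for d in range(len(received) - n + 1)
    ]


def detect_start(received, preamble):
    """Return the offset d that maximizes |C(d)|^2."""
    power = [abs(c) ** 2 for c in cross_correlate(received, preamble)]
    return power.index(max(power))


def make_received(preamble, offset, total_len, noise_sigma, rng):
    """Embed the preamble at `offset` in complex Gaussian noise (illustrative channel)."""
    rx = [
        complex(rng.gauss(0, noise_sigma), rng.gauss(0, noise_sigma))
        for _ in range(total_len)
    ]
    for k, p in enumerate(preamble):
        rx[offset + k] += p
    return rx


if __name__ == "__main__":
    rng = random.Random(7)
    # Placeholder preamble: unit-modulus samples with random phase.
    preamble = [cmath.exp(2j * cmath.pi * rng.random()) for _ in range(64)]
    rx = make_received(preamble, offset=37, total_len=200, noise_sigma=0.3, rng=rng)
    print("estimated start of frame:", detect_start(rx, preamble))
```

Each output of `cross_correlate` takes one complex multiplication per preamble sample, which is the computational cost that makes cross-correlation expensive compared with the iteratively updated autocorrelation.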
Autocorrelation-based techniques are preferred for implementation
on FPGA because of their lower hardware costs. Dick and Harris
[12] reported on the FPGA implementation of an OFDM trans-
ceiver. They showed that FPGAs, with their highly parallel archi-
tecture, are suitable for the implementation of OFDM transceivers.
Wang et al. [13] also presented an FPGA implementation of an
OFDM-WLAN synchronizer. In their design, timing synchronization
is obtained by double autocorrelation based on short training symbols,
which reduces the hardware cost on FPGA. Fort et al. [14]
compared the performance and complexity of FPGA implementation
of autocorrelation and cross-correlation algorithms. Their results
show that the accuracy of cross-correlation algorithms is better than
that of autocorrelation algorithms. However, the accuracy of
cross-correlation comes at significant hardware cost. Although a new
cross-correlator implementation is proposed in [14] to reduce hardware
cost compared to a classic cross-correlation approach, it is still
at least five times more complex to implement than autocorrelation,
because several multipliers are required.
Cross-correlation between received samples and a known preamble
can achieve highly accurate timing synchronization; however, this
requires significant resources. Multiplierless correlators for timing
synchronization were introduced in [15], designed for IEEE 802.11a
OFDM frames, based on expressing the correlator coefficients as
sums of powers of 2 that only require shift and add operations.
The authors identified a correlator that eliminates the need for
multiplication, requiring only 26 additions/subtractions per output
while maintaining synchronization accuracy similar to that of a
multiplier-based implementation.
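To make the shift-and-add idea concrete, here is a minimal sketch in which each integer coefficient is approximated greedily by a few signed powers of two, so that every product reduces to shifts and additions. The greedy decomposition and the names `pow2_terms`, `terms_value`, `shift_add`, and `multiplierless_correlate` are assumptions chosen for illustration; they are not the coefficient selection used in [15].

```python
def pow2_terms(coeff, max_terms):
    """Greedily approximate an integer as a sum of at most max_terms signed powers of two.

    Returns a list of (sign, shift) pairs meaning sign * 2**shift.
    """
    terms = []
    residual = coeff
    while residual != 0 and len(terms) < max_terms:
        mag = abs(residual)
        lo = mag.bit_length() - 1  # floor(log2(mag))
        # Pick whichever of 2**lo and 2**(lo+1) is closer to mag.
        shift = lo + 1 if (1 << (lo + 1)) - mag < mag - (1 << lo) else lo
        sign = 1 if residual > 0 else -1
        terms.append((sign, shift))
        residual -= sign * (1 << shift)
    return terms


def terms_value(terms):
    """Integer value represented by a list of (sign, shift) pairs."""
    return sum(sign * (1 << shift) for sign, shift in terms)


def shift_add(x, terms):
    """Multiply integer x by the quantized coefficient using only shifts and adds."""
    acc = 0
    for sign, shift in terms:
        acc += (x << shift) if sign > 0 else -(x << shift)
    return acc


def multiplierless_correlate(samples, coeffs, max_terms):
    """Compute sum_k r[d+k] * conj(c[k]) on (re, im) integer pairs without multiplies."""
    quantized = [(pow2_terms(cr, max_terms), pow2_terms(ci, max_terms)) for cr, ci in coeffs]
    out = []
    for d in range(len(samples) - len(coeffs) + 1):
        acc_re = acc_im = 0
        for k, (tr, ti) in enumerate(quantized):
            a, b = samples[d + k]
            # (a + jb) * (cr - j*ci) = (a*cr + b*ci) + j(b*cr - a*ci)
            acc_re += shift_add(a, tr) + shift_add(b, ti)
            acc_im += shift_add(b, tr) - shift_add(a, ti)
        out.append((acc_re, acc_im))
    return out


if __name__ == "__main__":
    print(pow2_terms(7, 2))                   # 7 as 8 - 1
    print(terms_value(pow2_terms(7, 1)))      # coarser: a single power of two
    print(multiplierless_correlate([(1, 2), (3, 4), (-1, 0)], [(3, -1), (0, 2)], 4))
```

Lowering `max_terms` coarsens the coefficients and reduces the number of additions per output, which is the accuracy versus cost trade-off that a quantized multiplierless correlator explores.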
OFDM is one of the main candidate modulation schemes
for cognitive radios, and we believe FPGAs are an ideal platform
owing to their flexibility [16]; hence, optimizing this
functionality is key. Modern FPGAs contain various resources
that can be used to implement cross-correlation. This brief presents
the design of several correlators for timing synchronization with
preamble symbols based upon IEEE 802.16d. We compare designs
using specialized digital signal processing (DSP) Slices to a multi-
plierless approach on Xilinx Virtex-6 and Spartan-6 FPGA devices.
Attempting to implement correlation on FPGAs without considering,
and designing for, the underlying architecture results in a highly
inefficient implementation. In this brief, we show optimized FPGA
designs, built to fit the FPGA architecture, and evaluate performance,
timing synchronization accuracy, resource utilization, and power
consumption, to understand whether a multiplier-based mapping is
beneficial when using modern devices.
II. IMPLEMENTATION OF CORRELATORS
The downlink preamble in IEEE 802.16d [17] contains two consec-
utive OFDM symbols, as shown in Fig. 1. The short symbol consists
of four identical 64-sample fragments in time, preceded by a CP. This
is followed by the long symbol which contains two repetitions of a
128-sample fragment and a CP [17].
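The repetition structure described above can be sketched as follows. The fragment contents are random placeholders rather than the sequences defined in [17], the cyclic prefix length `cp_len` is left as a parameter (the value 64 in the demo is an assumption), and all helper names are hypothetical.

```python
import cmath
import random


def with_cyclic_prefix(body, cp_len):
    """Prepend the last cp_len samples of body as a cyclic prefix."""
    if not 0 < cp_len <= len(body):
        raise ValueError("cp_len must be between 1 and len(body)")
    return body[-cp_len:] + body


def build_preamble(frag64, frag128, cp_len):
    """Short symbol (CP + 4 x 64 samples) followed by long symbol (CP + 2 x 128 samples)."""
    if len(frag64) != 64 or len(frag128) != 128:
        raise ValueError("fragments must have 64 and 128 samples")
    short_symbol = with_cyclic_prefix(frag64 * 4, cp_len)
    long_symbol = with_cyclic_prefix(frag128 * 2, cp_len)
    return short_symbol + long_symbol


def random_fragment(n, rng):
    """Placeholder unit-modulus samples; not the standard's preamble values."""
    return [cmath.exp(2j * cmath.pi * rng.random()) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    cp_len = 64  # assumed value for the demo only
    preamble = build_preamble(random_fragment(64, rng), random_fragment(128, rng), cp_len)
    short_body = preamble[cp_len:cp_len + 256]
    print("total samples:", len(preamble))
    print("short symbol repeats every 64 samples:",
          all(short_body[i] == short_body[i + 64] for i in range(192)))
```

Because the fragments repeat, the short symbol matches itself at a lag of 64 samples and the long symbol at a lag of 128 samples, which is the property that autocorrelation-based synchronizers exploit.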
1063–8210/$31.00 © 2012 IEEE