Available Online at www.ijcsmc.com
International Journal of Computer Science and Mobile Computing
A Monthly Journal of Computer Science and Information Technology
ISSN 2320–088X
IJCSMC, Vol. 2, Issue. 4, April 2013, pg.146 – 154
RESEARCH ARTICLE
© 2013, IJCSMC All Rights Reserved
Hardware-Optimized Lattice Reduction Algorithm
for WiMax/LTE MIMO Detection using VLSI
R. Ragumadhavan
Assistant Professor, Department of Electronics and Communication Engineering, PSNA College of Engineering and Technology, Dindigul, Tamilnadu, India
raguece85@gmail.com
Abstract— This paper presents the first ASIC implementation of an LR algorithm which achieves ML
diversity. The VLSI implementation is based on a novel hardware-optimized LLL algorithm that has 70%
lower complexity than the traditional complex LLL algorithm. This reduction is achieved by replacing all the
computationally intensive CLLL operations (multiplication, division and square root) with low-complexity
additions and comparisons. The VLSI implementation uses a pipelined architecture that produces an LR-
reduced matrix every 40 cycles, which is a 60% reduction compared to current implementations. The
proposed design was synthesized in both 130nm and 65nm CMOS, resulting in clock speeds of 332MHz and
833MHz, respectively. The 65nm result is a 4X improvement over the fastest LR implementation to date. The
proposed LR implementation is able to sustain a throughput of 2Gbps, thus achieving the high data rates
required by future standards such as IEEE 802.16m (WiMAX) and LTE-Advanced.
Key Terms: - WiMax; MIMO; Lattice Reduction; LTE
I. INTRODUCTION
Recently, lattice-reduction (LR) has been proposed in conjunction with MIMO detection schemes to improve
their performance via transforming the system model into an equivalent one with a more orthogonal channel
matrix, thereby lowering the likelihood of detection errors due to noise perturbations [1]. The LLL algorithm
(due to Lenstra, Lenstra and Lovász) [2] is the most commonly used LR method and has been shown to achieve
ML diversity for low-complexity detectors [3] and significantly improve the performance of more complex
detectors such as K-Best [4]. A more efficient, complex-valued extension to LLL (known as CLLL) was
developed in [5]. However, the VLSI implementation of CLLL remains problematic due to its computationally
intensive operations and its non-deterministic complexity.
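As background for the operations at issue, the core of the (real-valued) LLL algorithm alternates size reduction against earlier basis vectors with a swap test based on the Lovász condition; the multiplications, divisions, and norm computations inside the Gram-Schmidt updates are exactly the costly operations discussed above. The following is a minimal, unoptimized Python sketch of textbook LLL (not the CLLL variant or the HOLLL algorithm proposed here), operating on the columns of a basis matrix:

```python
import numpy as np

def lll_reduce(B, delta=0.75):
    """Textbook LLL reduction of the columns of basis matrix B.

    Returns (B_red, T) with B_red = B @ T and T unimodular (|det T| = 1).
    Note the divisions and multiplications in the Gram-Schmidt step: these
    are the operations a hardware-oriented variant seeks to avoid.
    """
    B = B.astype(float).copy()
    n = B.shape[1]
    T = np.eye(n, dtype=int)

    def gso(B):
        # Gram-Schmidt orthogonalization: unnormalized Q* and coefficients mu.
        Q = np.zeros_like(B)
        mu = np.eye(n)
        for i in range(n):
            Q[:, i] = B[:, i]
            for j in range(i):
                mu[i, j] = (B[:, i] @ Q[:, j]) / (Q[:, j] @ Q[:, j])
                Q[:, i] -= mu[i, j] * Q[:, j]
        return Q, mu

    k = 1
    while k < n:
        Q, mu = gso(B)
        # Size reduction: make column k nearly orthogonal to columns k-1..0.
        for j in range(k - 1, -1, -1):
            q = int(np.rint(mu[k, j]))
            if q != 0:
                B[:, k] -= q * B[:, j]
                T[:, k] -= q * T[:, j]
                Q, mu = gso(B)
        # Lovász condition: accept column k, or swap and step back.
        if Q[:, k] @ Q[:, k] >= (delta - mu[k, k - 1] ** 2) * (Q[:, k - 1] @ Q[:, k - 1]):
            k += 1
        else:
            B[:, [k - 1, k]] = B[:, [k, k - 1]]
            T[:, [k - 1, k]] = T[:, [k, k - 1]]
            k = max(k - 1, 1)
    return B, T
```

Because the loop structure is data-dependent (swaps can move k backward), the number of iterations is not fixed in advance, which is the non-deterministic complexity noted above.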
To date, only a small number of VLSI implementations of LR have been reported in the literature, such as [6],
[7] and [8]. Each of these designs was implemented on an FPGA platform. The Clarkson algorithm (CA),
presented in [8], is a variant of CLLL that achieves a lower complexity by modifying the CLLL reduction
criterion. However, CA, like CLLL, has the drawback of variable complexity and it also relies on
computationally intensive operations such as division and multiplication. Another complex LR algorithm known
as Seysen’s algorithm (SA) was presented in [9]; however, we show that SA has a much higher computational
complexity than both CA and CLLL. Thus, SA is even more problematic from an implementation standpoint.
Therefore, to achieve an efficient and high-throughput VLSI implementation of LR, there is a need for an
algorithm with significantly reduced and deterministic complexity.
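To make concrete why a reduced basis is worth the effort, LR-aided linear detection applies the unimodular transform T produced by the reduction, rounds in the better-conditioned reduced domain, and maps the decision back to the original symbol space. The sketch below illustrates this standard LR-aided zero-forcing structure (the channel matrix, T, and the noise-free setup in the example are illustrative assumptions, not values from this paper):

```python
import numpy as np

def lr_aided_zf(H, T, y):
    """LR-aided zero-forcing detection for y = H @ s + n, s integer-valued.

    T is assumed unimodular (|det T| = 1) with H @ T more orthogonal
    than H, e.g. the output of an LLL/CLLL reduction.
    """
    H_red = H @ T                               # reduced, near-orthogonal channel
    z_hat = np.rint(np.linalg.pinv(H_red) @ y)  # round in the reduced domain
    return T @ z_hat                            # map back to original symbols

# Illustrative noise-free example (assumed values):
H = np.array([[1.0, 1.0], [0.0, 0.1]])  # ill-conditioned 2x2 channel
T = np.array([[1, -1], [0, 1]])         # a unimodular reduction transform
s = np.array([2.0, -1.0])               # transmitted integer symbols
y = H @ s
s_hat = lr_aided_zf(H, T, y)
```

Rounding in the reduced domain is less sensitive to noise perturbations precisely because the effective channel H @ T is closer to orthogonal, which is the performance mechanism described in [1] and [3].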
In this paper we propose the design and ASIC implementation of a modified CLLL algorithm which
achieves a 70% reduction in complexity over existing LR algorithms (including CLLL [5], CA [8], and SA [9])
with effectively the same BER performance. Our algorithm, which we named HOLLL (Hardware-Optimized
LLL), eliminates the need for all computationally intensive LLL operations (such as division and multiplication)