Fast Inverse Square Root Based Matrix Inverse For MIMO-LTE Systems
Chinmaya Mahapatra, Saad Mahboob,
Victor C.M. Leung
Dept. of Electrical and Computer Engineering,
University of British Columbia, Vancouver, Canada
Chinmaya@ece.ubc.ca, smahboob@ece.ubc.ca
Thanos Stouraitis
Dept. of Electrical and Computer Engineering,
University of Patras, Rio, Greece
thanos@upatras.gr
Abstract—This paper addresses the designing of a low
complexity, high-speed matrix inversion algorithm that uses a
fast inverse square root and is based on QR decomposition and a
systolic array architecture. Matrix operations are the most costly
computational module within MIMO-LTE receivers. We
demonstrate a novel approach to matrix inversion that reduces
the cost of the MIMO receiver module in terms of latency and
complexity. The cost is reduced by implementing a 4x4 matrix
inverse on a Xilinx Virtex-6 FPGA and optimizing the module for
speed and power through pipelining, which achieves better
throughput. The results are compared with the state-of-the-art
CORDIC-based squared Givens rotation technique.
Keywords-MIMO LTE; Fast inverse square root; QR
decomposition; Systolic array; Xilinx Virtex-6 FPGA; Pipelining;
CORDIC
I. INTRODUCTION
Multiple-Input Multiple-Output (MIMO) Long Term
Evolution (LTE) [1], [2] is one of the newer technologies in
wireless communications for improving bandwidth utilization
efficiency. Multi-user MIMO LTE uses two popular multiple-access
schemes, Orthogonal Frequency Division Multiple Access (OFDMA)
for the downlink and Single-Carrier Frequency Division Multiple
Access (SC-FDMA) for the uplink, which provide high data rates in
wireless environments. In OFDMA, multiple access is achieved by
assigning narrow sub-bands to users. Each narrow sub-band has a
flat frequency response, so a frequency-selective channel is
converted into many flat-fading sub-channels. This achieves a
higher MIMO spectral efficiency by averaging interference from
neighboring cells, and makes the system less susceptible to
various kinds of impulse noise.
Most channel estimation processes need to invert a
matrix that is either the channel state information or a
nonlinear function of it. Increasing the number of transmit
and receive antennas provides a higher data rate, but the
dimension of this matrix also increases. Fast approaches to
matrix inversion are therefore required. In this paper, we
present a matrix inversion technique using fast inverse
square root based Givens rotation and optimize it for
speed and power.
The paper is organized as follows. Section II gives a
brief overview of various matrix inversion algorithms along
with their drawbacks. Section III provides a brief description of
the matrix inversion approach using fast inverse square root
based Givens rotation, QR decomposition and a systolic array.
Section IV describes the FPGA implementation and analysis.
Section V outlines the error analysis. Finally, Section VI
concludes the paper, followed by references.
II. MATRIX INVERSION ALGORITHMS
Methods for computing matrix inversion can be divided
into two categories: iterative and direct. Iterative methods
require an initial estimate of the solution and subsequent
updates based on calculation of the previous estimate error.
Normally, these iterative methods involve high-complexity
sequential matrix computations and are not particularly
suitable for real-time implementation. Among direct methods,
QR decomposition (QRD) is an attractive approach for matrix
inversion due to its well-known numerical
stability [3]. Several algorithms and architectures have been
proposed for the computation of QRD-based matrix inversion;
those which employ the Gram-Schmidt [4] and conventional
Givens rotations (CGR) algorithms are disadvantaged from an
implementation perspective as they require high-complexity
square-root operations. Whilst the shift-and-add processing
nature of CORDIC-based matrix inversion [5] offers low
complexity hardware implementation, its inherent latency can
preclude it from high-performance applications. Squared
Givens rotations (SGR) offer square-root free processing and a
number of SGR-based matrix inversion architectures have
been proposed [6], [7], [8]. The approach proposed in this paper,
explained in the sections below, replaces the square-root and
division operations in the matrix inverse with shift and multiply
operations. It thereby reduces latency and increases speed
compared to the other architectures.
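To illustrate how a reciprocal square root can be obtained from a shift, a subtraction and a few multiplications, the following is a minimal Python sketch of the widely known single-precision bit-level method. It is not taken from the paper: the function name fast_inv_sqrt, the constant 0x5F3759DF and the number of Newton-Raphson steps are assumptions made for illustration, and the paper's hardware formulation may differ.

```python
import struct


def fast_inv_sqrt(x: float, newton_steps: int = 1) -> float:
    """Approximate 1/sqrt(x) for x > 0 without a square root or a division.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    if x <= 0.0:
        raise ValueError("x must be positive")
    # Reinterpret the IEEE 754 single-precision bit pattern of x as an integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Initial guess from one right shift and one subtraction.
    # 0x5F3759DF is the commonly cited constant (an assumption here).
    bits = 0x5F3759DF - (bits >> 1)
    y = struct.unpack("<f", struct.pack("<I", bits))[0]
    # Newton-Raphson refinement uses only multiplications and a subtraction.
    for _ in range(newton_steps):
        y = y * (1.5 - 0.5 * x * y * y)
    return y


if __name__ == "__main__":
    for value in (0.25, 2.0, 4.0, 100.0):
        print(value, fast_inv_sqrt(value), value ** -0.5)
```

With one refinement step the relative error of this method is roughly 0.2 percent at worst; each additional step improves accuracy at the cost of extra multiplications, which is the usual trade-off between latency and precision.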
III. MATRIX INVERSION USING QR DECOMPOSITION AND
SYSTOLIC ARRAY
In this paper we present results for inverting a matrix
of size 4 × 4. The same idea, with a slight modification in
hardware, can be used for larger matrix sizes. The hardware
design uses QR decomposition and systolic arrays
[7].
A = QR (1)
Let A be an n × p matrix of full rank p. The QR
decomposition factors A into an upper triangular matrix
R of size p × p and an orthogonal matrix Q using plane rotations.
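The decomposition in (1) and its use for inversion can be sketched in software as follows. This is a minimal pure-Python illustration under stated assumptions, not the paper's systolic-array design: it assumes a square, non-singular matrix, uses conventional Givens rotations, and the function names givens_qr and invert_via_qr are hypothetical. The reciprocal norm is computed with math.sqrt for clarity; that line marks where a fast inverse square root would be substituted, and the back-substitution uses ordinary division.

```python
import math


def givens_qr(a):
    """Return (q_t, r) such that q_t * a = r, with r upper triangular."""
    n = len(a)
    r = [row[:] for row in a]
    q_t = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for col in range(n):
        # Zero the sub-diagonal entries of this column from the bottom up.
        for row in range(n - 1, col, -1):
            x, y = r[row - 1][col], r[row][col]
            norm_sq = x * x + y * y
            if norm_sq == 0.0:
                continue
            # Reciprocal square root: the candidate for a fast approximation.
            inv_norm = 1.0 / math.sqrt(norm_sq)
            c, s = x * inv_norm, y * inv_norm
            # Apply the same plane rotation to R and to the accumulated Q^T.
            for m in (r, q_t):
                for k in range(n):
                    upper, lower = m[row - 1][k], m[row][k]
                    m[row - 1][k] = c * upper + s * lower
                    m[row][k] = -s * upper + c * lower
    return q_t, r


def invert_via_qr(a):
    """Invert a square non-singular matrix using A^-1 = R^-1 * Q^T."""
    n = len(a)
    q_t, r = givens_qr(a)
    inv = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Back-substitution: solve R x = (column j of Q^T).
        for i in range(n - 1, -1, -1):
            acc = q_t[i][j] - sum(r[i][k] * inv[k][j] for k in range(i + 1, n))
            inv[i][j] = acc / r[i][i]
    return inv


if __name__ == "__main__":
    example = [
        [4.0, 1.0, 2.0, 0.0],
        [1.0, 3.0, 0.0, 1.0],
        [2.0, 0.0, 5.0, 1.0],
        [0.0, 1.0, 1.0, 2.0],
    ]
    for row in invert_via_qr(example):
        print(row)
```

Since Q is orthogonal, its inverse is its transpose, so only the triangular factor R needs to be inverted explicitly; each rotation needs just one reciprocal square root followed by multiplications.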
This work was supported by the Canadian Natural Sciences and
Engineering Research Council through grant STPGP 396756
2012 International Conference on Control Engineering and Communication Technology
978-0-7695-4881-4/12 $26.00 © 2012 IEEE
DOI 10.1109/ICCECT.2012.253
321