IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 10, NO. 3, JUNE 2002 279
Architectural Strategies for Low-Power
VLSI Turbo Decoders
Guido Masera, Marco Mazza, Gianluca Piccinini, Fabrizio Viglione, and Maurizio Zamboni
Abstract—The use of “turbo codes” has been proposed for sev-
eral applications, including the development of wireless systems,
where highly reliable transmission is required at very low
signal-to-noise ratios (SNR). The problem of extracting the best coding
gains from this kind of code has been deeply investigated in recent
years. The hardware implementation of turbo codes is also a very
challenging topic, mainly due to the iterative nature of the decoding
process, which demands an operating frequency much higher than
the data rate; in the case of wireless applications, the design
constraints become even stricter due to the low-cost and low-power
requirements.
This paper first presents a new architecture for the decoder core
with improved area and power dissipation properties; then parti-
tioning techniques are proposed to reduce the power consumption
of the decoder memories. It is proven that most of the power is
dissipated by the large RAM units required by the decoder, so the
described technique is very efficient: an average power saving of
70% with an area overhead of 23% has been obtained on a set of
analyzed architectures.
Index Terms—High performance, low-power design, low voltage,
memory, turbo-decoding, partitioning, very large scale integration
(VLSI).
I. INTRODUCTION
CONVOLUTIONAL concatenated codes with iterative
decoding (“Turbo codes” [1]) have proved to be one of the
most powerful solutions for high coding gain applications.
A concatenated encoder is composed of two or more recur-
sive and systematic convolutional encoders. Interleaving blocks
are placed among single encoders and work as memories in
which data are read and written in different orders. There are
two primary schemes of connection: parallel concatenated con-
volutional codes (PCCC, the turbo code originally proposed in
[1]) and serially concatenated convolutional codes (SCCC). Fig. 1
reports the general structures of the two schemes, each built from
convolutional encoders and an interleaver that together map the
data stream to be encoded onto the encoded output
streams. SCCCs have been shown to yield performance
comparable, and in some cases superior, to PCCC turbo codes
[2]. In [3] more schemes of connection are presented.
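
The parallel scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 4-state constituent code with feedback polynomial 1 + D + D^2 and feedforward polynomial 1 + D^2, the function names `rsc_encode` and `pccc_encode`, and the explicit permutation list `perm` are all choices made here, not details taken from the paper.

```python
def rsc_encode(bits):
    """Parity stream of a recursive systematic convolutional (RSC) encoder.

    Assumed generators (not from the paper): feedback 1 + D + D^2,
    feedforward 1 + D^2.
    """
    s1 = s2 = 0  # shift-register contents, initially zero
    parity = []
    for u in bits:
        a = u ^ s1 ^ s2        # recursive part: input XOR both register taps
        parity.append(a ^ s2)  # feedforward part: 1 + D^2 applied to a
        s1, s2 = a, s1         # shift the register by one position
    return parity


def pccc_encode(bits, perm):
    """Parallel concatenation: systematic bits plus two parity streams."""
    interleaved = [bits[i] for i in perm]  # same data, read in a different order
    return list(bits), rsc_encode(bits), rsc_encode(interleaved)


systematic, parity1, parity2 = pccc_encode([1, 0, 1, 1], [2, 0, 3, 1])
```

The second encoder sees the same data as the first, only in the order imposed by the interleaver, so the three output streams together form a rate-1/3 code in this sketch. A serial scheme would instead feed the interleaved output of the first encoder into the second.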
The decoder is composed of a concatenation of interleavers
and soft decoders [3] (soft-in soft-out, SISO) which produce in-
dexes of reliability (soft information) related to the input and
output symbol streams of each encoder. The whole decoder
operates iteratively, and the decoding process is stopped
when the desired level of reliability is reached.
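
The iterative exchange between soft decoders and interleavers can be outlined as the following control-flow sketch. It makes several assumptions that do not come from the paper: `siso1` and `siso2` are hypothetical callables that map a list of a priori log-likelihood ratios (LLRs) to a list of extrinsic LLRs, the channel observations are assumed to be bound inside them, a positive LLR is taken to mean bit 1, and the reliability test is a simple threshold on the smallest LLR magnitude. The internals of a SISO unit are not modeled.

```python
def iterative_decode(siso1, siso2, channel, perm, threshold, max_iters=8):
    """Exchange extrinsic LLRs between two SISO units until reliable.

    siso1 and siso2 are hypothetical callables (a priori -> extrinsic);
    channel holds the channel LLRs of the systematic bits.
    """
    n = len(channel)
    apriori = [0.0] * n                        # no prior knowledge at the start
    for iteration in range(1, max_iters + 1):
        ext1 = siso1(apriori)                  # first decoder, natural order
        ext2 = siso2([ext1[i] for i in perm])  # second decoder, interleaved order
        apriori = [0.0] * n
        for k, i in enumerate(perm):           # de-interleave to natural order
            apriori[i] = ext2[k]
        llr = [c + a + b for c, a, b in zip(channel, ext1, apriori)]
        if min(abs(x) for x in llr) >= threshold:  # desired reliability reached
            break
    bits = [1 if x > 0 else 0 for x in llr]    # assumed sign convention
    return bits, iteration
```

The loop ends either when every soft output exceeds the chosen threshold or when `max_iters` is reached, which bounds the decoding latency in this sketch.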
Manuscript received July 27, 2000; revised April 6, 2001.
The authors are with the Dipartimento di Elettronica, Politecnico di
Torino, Corso Duca degli Abruzzi 24-10129 Torino, Italy (e-mail: guido@
vlsilab01.polito.it; mazza@vlsilab01.polito.it; gianluca@vlsilab01.polito.it;
viglione@vlsilab01.polito.it; maurizio@vlsilab01.polito.it).
Publisher Item Identifier S 1063-8210(02)04168-9.
Fig. 1. Convolutional encoders with interleavers: (a) parallel (PCCC);
(b) serial (SCCC).
Recently, turbo codes have been proposed for wireless com-
munications, such as the universal mobile telecommunication
system (UMTS) [4], [5] for the third generation of mobile
communications; other applications of turbo codes are in standard
protocols for disk drives [6] and in satellite and deep-space
communications [7], such as in the European Space Agency
(ESA) mission Rosetta.
The design of low-cost and low-power turbo decoders is very
important in wireless communication systems. Several papers
have been published on the subject of low-power turbo decoder
implementation. In [8], [9], and [10], different stopping criteria
are proposed that halt the decoding iterations by monitoring
online various combinations of performance-related
quantities.
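
As one illustration of online monitoring, a simple rule stops iterating once the hard decisions no longer change from one iteration to the next. This rule is chosen here for illustration only and is not claimed to be the criterion used in [8], [9], or [10]; the function names and the convention that a positive soft value means bit 1 are assumptions.

```python
def hard_decisions(llrs):
    """Map soft values to bits (assumed convention: positive means 1)."""
    return [1 if x > 0 else 0 for x in llrs]


def should_stop(prev_llrs, curr_llrs):
    """True when no hard decision changed between consecutive iterations."""
    if prev_llrs is None:  # first iteration: nothing to compare against
        return False
    return hard_decisions(prev_llrs) == hard_decisions(curr_llrs)
```

A decoder loop would call `should_stop` once per iteration with the soft outputs of the previous and current pass, skipping the remaining iterations (and the power they would consume) when it returns true.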
In [11], data flow transformations are described as a method
to reduce both the size and the number of transfers of storage
blocks at the cost of some extra calculations. With the target of
more effectively using the memories, algorithm transformations
have been analyzed. That work obtained a power saving of about
60% and a speed-up of 70% with an area overhead of 20%, although
these results refer only to the SISO units and not to the
entire architecture of the decoder. In [12] and [13], analog
implementations of turbo decoders are proposed as low-power
solutions. However, a systematic study of the optimization of power
dissipation for the entire decoder has not yet been performed.
In this paper, an analysis of the distribution of the power con-
sumption among the constitutive blocks of the decoder is pre-
sented: as a result of this study, it is shown that memory blocks
are the most critical units of the decoder in terms of power con-
sumption. The paper also proposes two architectural solutions
for low-power implementation: the first solution is a new
architecture for the SISO unit, which offers roughly a 20% reduction
1063-8210/02$17.00 © 2002 IEEE