RECONFIGURABLE DECODER ARCHITECTURES FOR RAPTOR CODES
Hady Zeineddine and Mohammad M. Mansour
ECE Department
American University of Beirut
Beirut, Lebanon
Email: {hma41,mmansour}@aub.edu.lb
ABSTRACT
Decoder architectures for architecture-aware Raptor codes having
regular message access-and-processing patterns are presented. Raptor
codes are a class of concatenated codes composed of a fixed-rate
precode and a Luby-Transform (LT) code that can be used
as rate-less error-correcting codes over communication channels.
In the proposed approach, the decoding procedure is mapped to
row processing of a regular matrix, which adapts effectively to
the code’s randomness and degree-irregularity. This is achieved
by 1) developing reconfigurable check node processors that attain
a constant throughput while processing LT- and LDPC-nodes of
varying degrees and numbers, 2) applying pseudo-random permutation
on the communicated messages, and 3) computing bit-to-check
messages in a serial, temporally distributed manner. A serial
decoder for a rate-0.4 code implementing the proposed approach
was synthesized in 65 nm CMOS technology. Hardware simulations
show that the decoder achieves a throughput of 22 Mb/s at BER
of $10^{-6}$, dissipates an average power of 222 mW and occupies
an area of 1.77 mm$^2$. A range of partially-parallel decoders with
desired throughput can be designed by replicating the processing
nodes of a serial decoder.
I. INTRODUCTION
A Raptor code is constructed by concatenating a fixed-rate precode
to a rateless LT code [1]. Raptor codes were originally designed to
operate on erasure channels, and later extended for correcting errors
over other communication channels [2]. The rate of a Raptor code
can be determined on a block-by-block basis, or even changed for the
same block upon a decoding failure, which makes these codes
advantageous over binary-input memoryless symmetric channels.
LT codes can be decoded efficiently using Gallager’s iterative
two-phase message-passing algorithm (TPMP), typically used in
LDPC decoding [3]. When the precode is an LDPC code, the TPMP
algorithm can be applied to the concatenated LT-LDPC code
instead of decoding in two stages. Joint decoding achieves
better coding performance, converges faster, and allows the
same hardware resources to be used for LT and LDPC
decoding. This motivates the need for a hardware-efficient decoder
architecture for Raptor codes that have an LDPC code as a precode.
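As a rough illustration of the two TPMP phases in joint decoding, the following Python sketch applies the standard sum-product (tanh-rule) updates. It is not taken from the paper: the function names `check_to_bit` and `bit_to_check` and the optional `channel_llr` argument are assumptions made here. The sketch assumes that an LT check node carries the channel LLR of its received output symbol, while an LDPC parity check has no channel term, so that one routine can serve both node types.

```python
import math

def check_to_bit(bit_msgs, channel_llr=None):
    """Phase 1: extrinsic check-to-bit LLRs via the tanh rule.

    bit_msgs    -- incoming bit-to-check LLRs, one per edge of the check node
    channel_llr -- LLR of the received LT output symbol (assumed convention);
                   None models an LDPC parity check, which has no channel term
    """
    t = [math.tanh(m / 2.0) for m in bit_msgs]
    z = 1.0 if channel_llr is None else math.tanh(channel_llr / 2.0)
    limit = 1.0 - 1e-12  # clip so that atanh stays finite
    out = []
    for i in range(len(t)):
        prod = z
        for j, tj in enumerate(t):
            if j != i:  # exclude the edge being updated (extrinsic rule)
                prod *= tj
        prod = max(min(prod, limit), -limit)
        out.append(2.0 * math.atanh(prod))
    return out

def bit_to_check(check_msgs, prior_llr=0.0):
    """Phase 2: extrinsic bit-to-check LLRs.

    Each outgoing message is the sum of all incoming check-to-bit LLRs
    (plus an optional prior) minus the message received on that edge.
    """
    total = prior_llr + sum(check_msgs)
    return [total - m for m in check_msgs]

# Degree-2 LDPC check: each edge simply receives the other edge's LLR.
ldpc_out = check_to_bit([1.0, -2.0])
# LT check whose output symbol was erased (channel LLR 0): no information.
lt_out = check_to_bit([1.0, -2.0, 3.0], channel_llr=0.0)
bit_out = bit_to_check([1.0, 2.0, 3.0])
```

Calling the same `check_to_bit` routine for both node types mirrors the motivation stated above for sharing hardware between LT and LDPC decoding, although the actual datapath may differ.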
The peculiar features of Raptor codes pose serious challenges
to applying the optimizations targeted at hardware-efficient LDPC
decoders to Raptor decoders (e.g., [4]–[7]). These features include
variable code rate, random LT-encoding, variable check-degree
distribution, and joint decoding of the LT code and LDPC precode.
These irregularity and randomness features lead to low resource
utilization, high control overhead, complex data movement patterns,
and stringent memory requirements, thus resulting in a
highly inefficient implementation.

This work was supported by funds from the University Research Board
at the American University of Beirut.

In [8], a method to construct architecture-aware (AA)-Raptor
codes was proposed. This method embeds compatible structure
into both LT and LDPC codes and decouples code structuring
from random LT encoding. In this paper, a decoder architecture
for this class of AA-Raptor codes is presented. The proposed
approach uses the code structure and architectural
optimizations to map the decoding procedure into row processing
of a regular matrix. The decoding schedule is thereby made simple,
regular, and identical across both LT and LDPC codes. To this
end, the bit-to-check message computation is temporally distributed
so that varying the rate or bit-node degrees changes the number
of cycles per decoding iteration, while leaving the workload per
cycle unchanged. To solve the check-degree variability problem,
a novel reconfigurable check-function unit (CFU), with a constant
throughput, is designed to process LT-nodes whose degrees sum to
a constant p, and LDPC nodes whose degrees are a multiple of p.
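The idea of serving several LT nodes whose degrees sum to p in a single pass can be pictured with a small behavioral model. The Python sketch below is only an assumption-laden illustration, not the CFU design of Section III: the names `min_sum_check` and `cfu_cycle` are hypothetical, the min-sum approximation is swapped in for the exact check function to keep the code short, channel LLRs are ignored, and every node is assumed to have degree at least 2. LDPC nodes spanning several cycles are not modelled.

```python
def min_sum_check(msgs):
    """Min-sum extrinsic outputs for one check node of degree >= 2."""
    if len(msgs) < 2:
        raise ValueError("this sketch assumes check degree >= 2")
    out = []
    for i in range(len(msgs)):
        others = msgs[:i] + msgs[i + 1:]
        negatives = sum(1 for m in others if m < 0)
        sign = -1.0 if negatives % 2 else 1.0
        out.append(sign * min(abs(m) for m in others))
    return out

def cfu_cycle(msgs, degrees):
    """Process p input messages as consecutive check nodes.

    msgs    -- the p bit-to-check messages delivered in one cycle
    degrees -- degrees of the check nodes sharing the cycle; must sum to p
    """
    if sum(degrees) != len(msgs):
        raise ValueError("node degrees must sum to the number of inputs")
    out, start = [], 0
    for d in degrees:
        out.extend(min_sum_check(msgs[start:start + d]))
        start += d
    return out

# p = 7 inputs shared by three nodes of degrees 3, 2, and 2.
p = 7
inputs = [1.0, -2.0, 3.0, -0.5, 4.0, 2.0, -1.5]
outputs = cfu_cycle(inputs, [3, 2, 2])
```

Whatever the degree list, the model consumes and produces exactly p messages per call, which is the constant-throughput property the text attributes to the CFU.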
The remainder of the paper is organized as follows. Section
II presents the decoding schedule and the corresponding serial
architecture. The reconfigurable check-node unit design is described
in Section III, and Section IV gives hardware simulation results for
the serial decoder implementation.
II. DECODER ARCHITECTURE
The decoding schedule, and consequently the architecture, is
based on the following three features of the Raptor codes constructed
in [8]. Throughout the paper, p is assumed to be prime.
Source matrix: The LT code is derived from a $p \times (p-1)$ matrix
$H_0 = [h_{ij}]$ of $p \times p$ shifted identity matrices. Each nonzero element
of $h_{ij}$ is in turn a $p \times p$ shifted identity matrix.
Row-Splitting: Every row in $H_0$, having weight $p-1$, is split
into several rows. The formed rows are appended to the LT matrix.
LDPC Matrix: Each row in the $p^2 \times p^2(p-1)$ LDPC matrix
$H_p$ has weight/check-degree $c(p-1)$ and is attained by merging
(or equivalently xoring) $c$ rows of $H_0$.
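The three features above can be tried out on a toy matrix. The Python sketch below is a simplified illustration under stated assumptions, not the construction of [8]: it uses a single level of shifted identity blocks instead of the nested structure, the shift of block j is arbitrarily set to j, and rows are split into contiguous groups of nonzero entries. The names `shifted_identity`, `toy_source_matrix`, `split_row`, and `merge_rows` are hypothetical.

```python
def shifted_identity(p, s):
    """p x p identity matrix with its ones cyclically shifted by s columns."""
    return [[1 if c == (r + s) % p else 0 for c in range(p)] for r in range(p)]

def toy_source_matrix(p):
    """Single-level stand-in for the source matrix: p rows, p - 1 blocks
    placed side by side, block j shifted by j (an assumed shift rule)."""
    blocks = [shifted_identity(p, j) for j in range(p - 1)]
    return [[v for b in blocks for v in b[r]] for r in range(p)]

def split_row(row, weights):
    """Row splitting: distribute the nonzero entries of a row over new rows
    whose weights are given; the weights must sum to the row weight."""
    support = [j for j, v in enumerate(row) if v]
    if sum(weights) != len(support):
        raise ValueError("weights must sum to the row weight")
    rows, start = [], 0
    for w in weights:
        new_row = [0] * len(row)
        for j in support[start:start + w]:
            new_row[j] = 1
        rows.append(new_row)
        start += w
    return rows

def merge_rows(rows):
    """Row merging: bitwise XOR of the given rows."""
    merged = [0] * len(rows[0])
    for r in rows:
        merged = [a ^ b for a, b in zip(merged, r)]
    return merged

p = 5
H = toy_source_matrix(p)                  # every row has weight p - 1 = 4
lt_rows = split_row(H[0], [1, 3])         # two LT rows of degrees 1 and 3
ldpc_row = merge_rows([H[0], H[1]])       # c = 2 rows merged, weight 2(p - 1)
```

Because the blocks are permutation matrices, distinct rows of the toy matrix have disjoint supports, so merging c of them yields weight c(p - 1), and XORing the split rows recovers the original row.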
II-A. Serial Decoder Architecture
The serial decoder processes messages corresponding to one row
of $H_0$ per cycle. Figure 1 illustrates the decoder architecture. For
clarity of exposition, let $H'_R$ be a submatrix of $H_0$ composed of $M$
rows used to generate the LT graph, concatenated with the $cp^2$ rows
1669 978-1-4577-0539-7/11/$26.00 ©2011 IEEE ICASSP 2011