ASIP design for partially structured LDPC codes

L. Dinoi, R. Martini, G. Masera, F. Quaglio and F. Vacca

A new class of partially structured LDPC codes is proposed and their performance evaluated. Part of the parity check matrix has a very regular structure that strongly simplifies implementation; a few randomly distributed ones, supported by a programmable ASIP (application specific instruction set processor), improve error correcting capabilities.

Introduction: Low density parity check (LDPC) [1] codes have been studied extensively in the last few years; however, their practical implementation is still a challenging topic. The main reason for this is the irregular structure of the parity check matrix (H), which results in inefficient and expensive interconnect architectures among processing elements. To cope with this problem, structured LDPC codes [2] have been proposed: a structure is given to H and ones are properly distributed, with the purpose of limiting interconnects while maintaining good error-correcting capability. In this work we partition non-zero elements into two classes: regular ones are positioned in accordance with a repetitive fixture laid not far from the diagonal, while random ones can be freely placed in the whole matrix. The potential advantage offered by this ‘partially structured’ approach is twofold: first, the modular fixture of regular ones strongly simplifies the communication structure among processing elements; secondly, different codes can be decoded by the same hardware, provided that the fixture of regular ones remains the same for all codes and random ones are handled by a flexible unit that can easily be adapted to changed positions.

Proposed codes: We designed a new class of partially structured LDPC codes [3–5]. They can be considered as an extension of structured eIRA (extended irregular repeat-accumulate) codes [5].
As a general rule, highly structured matrices and limited connectivity lead to poor performance; this effect is not observed in [2] because regular LDPC codes, with a minimum VN (variable node) degree of three, do not suffer from high error floors. By contrast, eIRA codes have a large number of degree-two VNs, which often results in high error floors. Therefore, we adopt a partial structure, allowing some of the edges to be placed randomly. As an example, Fig. 1 shows the parity-check matrix of a rate 1/2 code: three sections can be identified, with high-degree, degree-four and degree-two VNs. The degree distribution was obtained by means of tools mentioned in [4]. Structured blocks are permuted versions of the identity matrix: the permutations used in the blocks labelled $\Pi_i$ ($i = 1, 2, \ldots$) in Fig. 1 are triangular S-random interleavers built according to a tail-biting definition of the spread factor.

Fig. 1 Parity-check matrix structure for code-rate 1/2 (figure labels: degree = 2, degree = 4, maximum degree, staircase construction, random edges; blocks marked I and $\Pi_1$ to $\Pi_6$)

To show the potential of the proposed approach, we compared a rate 1/2 code with the slightly shorter code proposed for the IEEE 802.16e standard and with the regular structured code of [2]. The codeword length is approximately 2000 bits and the VN degree distribution is $\tilde{\lambda}_1 = 0.5000$, $\tilde{\lambda}_3 = 0.3750$, $\tilde{\lambda}_6 = 0.1250$. The percentages of edges in the staircase section, the regular patterns and the random positions are 29.6, 51.9 and 18.5%, respectively. According to the scheme of Fig. 1, there are nine macro-columns in the central section. If $D = (n - k)/12$, then for the $i$th ($i = 0, 1, \ldots, D - 1$) column of the $j$th ($j = 0, 1, \ldots, 8$) macro-column, the three edges are placed in the following rows:

$$jD + 2i + 1 - D \bmod 2$$
$$(2 + j)D + \Pi_1(i)$$
$$(3 + j)D + \Pi_2(i)$$

where the mod operator represents the remainder of the integer division. For each macro-column, the fourth edge is placed according to [6]. Similarly, for each column in the leftmost part of the matrix, five edges are placed according to a regular pattern and two are placed randomly.

The simulation results of Fig. 2 show clearly that our code does not suffer from a high error floor, despite its strong structure (indeed it performs better than the IEEE 802.16e code), and that the convergence threshold advantage typical of irregular LDPC codes is preserved with respect to the regular code of [2].

Fig. 2 Simulation results for code-rate 1/2 (FER and BER against $E_b/N_0$ in dB; curves: IEEE 802.16e (1728,864) FER and BER; Our (2040,1020) FER and BER; [2] BER)

Decoder architecture: The usual belief-propagation algorithm was rearranged to process separately the messages associated with regular and random ones. Consider the following update rule for VN $j$:

$$Q_{i,j} = \lambda_j + \sum_{i' \in M(j) \setminus i} R_{i'j} \qquad (1)$$

where $\lambda_j$ is the received intrinsic information, $R_{ij}$ is the message sent by check node (CN) $i$ to VN $j$, $M(j)$ is the set of CNs connected to VN $j$, and $Q_{i,j}$ is the message sent by VN $j$ to CN $i$. We also define $\mathcal{S}(j)$ and $\mathcal{R}(j)$ as the two subsets of nodes connected to VN $j$ and associated, respectively, with regular and random ones. The VN update rule is then rewritten as:

$$Q_{i,j} = \lambda_j + \sum_{i' \in \mathcal{S}(j) \setminus i} R_{i'j} + \sum_{i'' \in \mathcal{R}(j) \setminus i} R_{i''j} = \lambda_j + Q^{S}_{i,j} + Q^{R}_{i,j} \qquad (2)$$

The two partial VN to CN messages, $Q^{S}_{i,j}$ and $Q^{R}_{i,j}$, are calculated independently by the dedicated and programmable units, respectively, and then exchanged to obtain the final messages delivered to the CNs. The same method of split evaluation is applied to the generation of CN to VN messages.
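As an illustration of the split VN update in (2), the following minimal Python sketch computes the two partial sums separately and checks them against the unsplit rule (1). It is only a software model under stated assumptions: the function names and the dictionary-based message storage are hypothetical and do not describe the hardware units or the ASIP instruction set.

```python
def vn_update_reference(intrinsic, msgs):
    """Unsplit rule (1): Q_ij = lambda_j + sum of R_i'j over all CNs i' != i.

    msgs maps each CN index i in M(j) to the incoming message R_ij.
    """
    total = sum(msgs.values())
    return {i: intrinsic + total - r for i, r in msgs.items()}


def vn_update_split(intrinsic, regular_msgs, random_msgs):
    """Split rule (2): Q_ij = lambda_j + Q^S_ij + Q^R_ij.

    regular_msgs holds R_ij for CNs in S(j) (regular ones),
    random_msgs holds R_ij for CNs in R(j) (random ones).
    Assumption: the two index sets are disjoint.
    """
    sum_s = sum(regular_msgs.values())  # work of the dedicated unit in this model
    sum_r = sum(random_msgs.values())   # work of the programmable unit in this model
    out = {}
    for i, r in regular_msgs.items():
        # i belongs to S(j): exclude its own message from the regular part only
        out[i] = intrinsic + (sum_s - r) + sum_r
    for i, r in random_msgs.items():
        # i belongs to R(j): exclude its own message from the random part only
        out[i] = intrinsic + sum_s + (sum_r - r)
    return out


if __name__ == "__main__":
    # Hypothetical messages for one VN with two regular edges and one random edge
    print(vn_update_split(0.5, {3: 1.0, 7: -2.0}, {12: 0.25}))
```

Because each outgoing message excludes exactly one incoming message, only one of the two partial sums changes per edge, which is why the two parts can be computed independently and combined afterwards.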
The high-level view of the decoder architecture is given in Fig. 3, where the programmable and dedicated processing units are connected by means of memories for the exchange of partial messages. Only messages associated with randomly placed ones need to be accessed by the ASIP, while dedicated units only access messages related to the regular part of H.

In general-purpose embedded processors executing the LDPC decoding algorithm, a large overhead is associated with extensive memory accesses and complex processing in CNs. To overcome these limitations, the programmable unit was designed as an ASIP, with a specialised instruction set and hardware resources optimised to handle efficiently the required processing and memory accesses. The CoWare LISATek [7] methodology was adopted in the development of both

ELECTRONICS LETTERS 31st August 2006 Vol. 42 No. 18