ALOE-based flexible LDPC decoder

Ismael Gomez, Massimo Camatel, Jordi Bracke, Vuk Marojevic, Antoni Gelonch
Dept. of Signal Theory and Communications, Universitat Politecnica de Catalunya
Av. Canal Olímpic s/n, 08860 Castelldefels, Spain
{ismael.gomez, massimo.camatel, jordi.bracke, marojevic, antoni.gelonch}@tsc.upc.edu

Fabrizio Vacca, Guido Masera
Departamento di Elettronica, Politecnico di Torino
Corso Duca degli Abruzzi 24, Torino
{fabricio.vacca, guido.masera}@polito.it

Abstract—Radio communication terminals and infrastructure tend to support an increasing range of algorithms and radio access technologies. Flexible processing platforms are therefore needed to support multi-standard or heterogeneous radios. Channel decoding is one of the most computationally demanding digital signal processing blocks of a radio transceiver. At the same time, it offers a high degree of implementation flexibility and facilitates dynamic parameter adjustments. This paper presents a flexible LDPC decoder implemented on an FPGA device following the ALOE middleware design paradigm. We analyse the middleware efficiency in terms of flexibility versus resource requirements. The results show a relative middleware area overhead of 32 %.

Keywords: ALOE middleware, flexible LDPC, reconfigurable logic, SDR

I. INTRODUCTION

Continuous improvements in microelectronic technology have made it possible to integrate millions of MOS transistors and logic gates on a single integrated circuit (IC). This enables the design of novel integrated architectures with enhanced capabilities. Exploiting these capabilities requires novel design paradigms, in particular flexible architectures that can easily adapt to different applications and algorithms [1]. This evolution of digital processing architectures towards increasing flexibility is particularly evident in the field of wireless communication systems.
Since the number of radio standards is growing very fast and the diversity among them is also increasing, there is a need for a processing solution capable of handling as many standards as possible. In particular, the idea of software-defined radio (SDR) implies the future implementation of flexible multi-standard radios that support all these standards with no degradation in achievable data rate or transmission reliability. Flexible platforms are necessary for this purpose.

In this context, Multi-Processor System-on-Chip (MP-SoC) architectures have been widely investigated in recent years in order to accommodate the increasing throughput and flexibility requirements of emerging wireless communication standards.

Among the functionalities specified in wireless communication standards, one of the most demanding operations is channel decoding, which contributes at least 40 % to the total computational complexity of the physical layer of a wireless system. Each new wireless standard typically increases the data rate while keeping the occurrence of transmission errors low. Moreover, each standard provides different profiles depending on external conditions. Thus, an integrated circuit designed for telecommunication purposes needs a certain degree of flexibility in order to handle all these profiles. More flexible architectures can also support upcoming standards.

In this context, the present work proposes and evaluates a new, fully flexible solution for implementing a multi-standard, multi-mode iterative channel decoder supporting generic Low-Density Parity-Check (LDPC) codes [2]. These codes achieve high performance in terms of bit error rate (BER), although they have very high computing requirements at the receiver side.
At present, several applications, such as the digital satellite broadcasting system (DVB-S2), Wireless Local Area Network (IEEE 802.11n) and Metropolitan Area Network (IEEE 802.16e), incorporate them.

In MP-SoC architectures for iterative decoders, several independent data blocks can be decoded simultaneously on different processors. In addition to node computational capabilities, an interconnect structure is necessary to support the iterative message exchange among variable and check nodes. In this context, Network-on-Chip (NoC) has recently emerged as a new paradigm [3] for coping with these major design issues, in particular the on-chip interconnection needs.

Efficient MP-SoC architectures assume heterogeneous processing elements (PEs). Therefore, it is necessary to define an optimum mapping of tasks to the set of PEs that maximizes computational efficiency [4]. Moreover, the decoder throughput can be adapted over time by adjusting the total amount of resources assigned to it. Because the decoding algorithm can be parallelized, the more PEs are assigned to it, the higher its performance.
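
The iterative message exchange among variable and check nodes mentioned above can be illustrated with a minimal software sketch. The text does not state which decoding algorithm the decoder uses, so the min-sum approximation with a flooding schedule is an assumption made purely for illustration. The function name `min_sum_decode`, the toy matrix `H_TOY` (a small Hamming parity-check matrix, not a real low-density matrix) and the log-likelihood ratio (LLR) values are likewise hypothetical.

```python
# Illustrative sketch only: min-sum message passing with a flooding schedule.
# The algorithm choice, names, and toy data are assumptions, not the paper's design.

def min_sum_decode(H, llr, max_iters=20):
    """Decode channel LLRs (positive means bit 0) against parity-check matrix H.

    Returns (hard_decisions, iterations_used); iterations_used is None if
    no valid codeword was found within max_iters.
    Assumes every check node is connected to at least two variable nodes.
    """
    m, n = len(H), len(H[0])
    # Variable nodes attached to each check node.
    check_nbrs = [[v for v in range(n) if H[c][v]] for c in range(m)]
    # Check-to-variable messages, one per edge of the graph, initially zero.
    c2v = {(c, v): 0.0 for c in range(m) for v in check_nbrs[c]}
    bits = [0 if x >= 0 else 1 for x in llr]

    for iteration in range(1, max_iters + 1):
        # Variable node update: channel LLR plus all incoming check messages,
        # excluding the message that arrived on the edge being answered.
        total = list(llr)
        for (c, v), msg in c2v.items():
            total[v] += msg
        v2c = {(c, v): total[v] - c2v[(c, v)] for (c, v) in c2v}

        # Check node update: sign product and minimum magnitude of the
        # messages from all *other* variable nodes on this check.
        for c in range(m):
            for v in check_nbrs[c]:
                others = [v2c[(c, u)] for u in check_nbrs[c] if u != v]
                negatives = sum(1 for x in others if x < 0)
                sign = -1.0 if negatives % 2 else 1.0
                c2v[(c, v)] = sign * min(abs(x) for x in others)

        # Posterior LLRs and hard decision.
        post = list(llr)
        for (c, v), msg in c2v.items():
            post[v] += msg
        bits = [0 if x >= 0 else 1 for x in post]

        # Stop as soon as every parity check is satisfied.
        if all(sum(bits[v] for v in check_nbrs[c]) % 2 == 0 for c in range(m)):
            return bits, iteration
    return bits, None


# Toy example: all-zero codeword with one unreliable, flipped bit (index 0).
H_TOY = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]
LLR_TOY = [-1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0]

if __name__ == "__main__":
    decoded, used = min_sum_decode(H_TOY, LLR_TOY)
    print(decoded, used)
```

In this sketch the check node updates within one iteration do not depend on each other, and the same holds for the variable node updates. This is the property that would allow the work to be spread over several PEs, with the interconnect carrying the per-edge messages.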