A 772 Mbit/s 8.81 bit/nJ 90 nm CMOS Soft-Input Soft-Output Sphere Decoder

Filippo Borlenghi *, Ernst Martin Witte *, Gerd Ascheid *, Heinrich Meyr *†, Andreas Burg ‡

* Institute for Communication Technologies and Embedded Systems, RWTH Aachen University, 52056 Aachen, Germany, email: {borlenghi,witte,ascheid,meyr}@ice.rwth-aachen.de
† Visiting Professor at the Integrated Systems Laboratory 1, EPFL, 1015 Lausanne, Switzerland
‡ Telecommunications Circuits Laboratory, EPFL, 1015 Lausanne, Switzerland, email: andreas.burg@epfl.ch

Abstract: Multiple-input multiple-output (MIMO) wireless transmission can approach its full potential in terms of spectral efficiency only with iterative decoding, i.e., by exchanging soft information between the MIMO detector and the channel decoder. Solving the soft-input soft-output (SISO) MIMO detection problem entails a very high complexity, which can typically be reduced only at the cost of a communication-performance penalty. The single tree-search (STS) sphere-decoding (SD) algorithm covers a wide range of this complexity-performance tradeoff. In this paper, we describe the silicon implementation of SISO STS SD. The 90 nm CMOS ASIC operates at a lower signal-to-noise ratio than other MIMO detectors. The maximum throughput is 772 Mbit/s at an energy efficiency of 8.81 bit/nJ.

I. INTRODUCTION

Multiple-input multiple-output (MIMO) transmission can significantly increase the data rate in wireless communication systems by spatial multiplexing, without additional usage of limited resources such as bandwidth and transmit power. Unfortunately, in terms of digital baseband processing in the receiver, MIMO also considerably increases the complexity of the detector. Therefore, most circuit implementations accept a sub-optimal communication performance to reduce complexity.
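To give a rough sense of this complexity growth, the sketch below counts the candidate symbol vectors that an exhaustive maximum-likelihood search would have to examine: the constellation size raised to the power of the number of transmit antennas. The function name and the listed configurations are illustrative assumptions and are not taken from the implementation described in this paper.

```python
def ml_candidate_count(constellation_size: int, num_tx_antennas: int) -> int:
    """Return the number of symbol vectors an exhaustive search must examine.

    Each transmit antenna independently sends one of `constellation_size`
    symbols, so the search space is constellation_size ** num_tx_antennas.
    """
    if constellation_size < 1 or num_tx_antennas < 1:
        raise ValueError("both arguments must be positive integers")
    return constellation_size ** num_tx_antennas


# Hypothetical example configurations, chosen only for illustration.
for name, size in [("QPSK", 4), ("16-QAM", 16), ("64-QAM", 64)]:
    for n_tx in (1, 2, 4):
        count = ml_candidate_count(size, n_tx)
        print(f"{name:7s} with {n_tx} transmit antenna(s): {count} candidates")
```

With four transmit antennas and 64-QAM the exhaustive search space already exceeds sixteen million candidates per received vector, which is why practical detectors either simplify the problem or prune the search.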
Linear detectors, based on zero forcing or minimum mean square error (MMSE) criteria, and successive interference cancellation exhibit low complexity but also poor error-rate performance. Maximum-likelihood performance is approached by hard-output sphere decoders. A further performance gain over hard-output methods is achieved, with additional complexity, by providing soft information, as log-likelihood ratios (LLRs), to the channel decoder.

Iterative MIMO detection and decoding is the final hurdle towards approaching channel capacity [1], [2]. Introducing a feedback loop enables a soft-input soft-output (SISO) detector to improve its estimates based on extrinsic information computed by the channel decoder. Unfortunately, the resulting performance gain comes at the expense of a much higher detection complexity than in non-iterative schemes. The first silicon implementation of a SISO MIMO detector was presented only recently in [3], based on SISO MMSE parallel interference cancellation (PIC). This algorithm shows considerable communication-performance gains over non-iterative detectors, but, like other (quasi-)linear methods, it fails to exploit the spatial diversity provided by MIMO.

This limitation is overcome by SISO single tree-search (STS) sphere decoding (SD) [4], which has max-log maximum a posteriori (MAP) performance and the ability to fully exploit spatial diversity. Fig. 1 compares the communication performance, in terms of coded packet error rate (PER), of the non-iterative (iteration number $I = 1$) hard-output SD and the iterative ($I \geq 1$) SISO STS SD and SISO MMSE PIC algorithms for two communication scenarios. For a given number of iterations, STS SD always outperforms the MMSE PIC method. In Fig. 1(a) (fast Rayleigh fading channel), the communication-performance gap between the two algorithms ultimately diminishes for $I = 4$ since the strong code takes advantage of the rapidly changing channel conditions.
Unfortunately, this type of diversity is typically not available or cannot be exploited by a weaker code. In this case, shown in Fig. 1(b), with $I = 6$, STS SD still reaches the target 1 % PER at a 3 dB lower signal-to-noise ratio (SNR) than MMSE PIC, showing a significantly better robustness to the operating scenario. Moreover, for a given SNR, STS SD typically achieves the target PER with fewer iterations: for instance, with $I = 2$ STS SD already outperforms MMSE PIC at $I = 6$. In addition to adjusting the number of iterations, the complexity of STS SD can be tuned at run-time and traded off with communication performance, hence scaling the detection effort to the target PER and to the SNR operating point.

Contributions: We present, to the best of our knowledge, the first silicon implementation of SISO STS SD. Improving the architecture presented in [5], this 90 nm CMOS ASIC demonstrates the scalability of STS SD, achieving at high SNR a maximum throughput of 772 Mbit/s, twice as high as [5] and compatible with recent standards such as IEEE 802.11n, and an energy efficiency of 8.81 bit/nJ. At low SNR this ASIC provides, at a reduced throughput, a communication performance gain and a better robustness to channel conditions than other state-of-the-art detectors.

II. MIMO DETECTION BY SISO STS SD

A spatial-multiplexing MIMO system with $M_T$ transmit and $M_R \geq M_T$ receive antennas is assumed [1]. The transmitter sends a symbol vector $\mathbf{s} = [s_1, \ldots, s_{M_T}]^T \in \mathcal{O}^{M_T}$, where each $s_i$ ($i = 1, \ldots, M_T$) is obtained by mapping $Q$ bits $x_{i,b} \in \{+1, -1\}$ ($b = 1, \ldots, Q$) to an element of the complex-valued constellation $\mathcal{O}$. The received signal is given by the complex symbol vector $\mathbf{y} = \mathbf{H}\mathbf{s} + \mathbf{n}$, where $\mathbf{H} \in \mathbb{C}^{M_R \times M_T}$