VLSI Implementation of the List Sphere Algorithm

M. Wenk, A. Burg, M. Zellweger, C. Studer, and W. Fichtner
Integrated Systems Laboratory
Swiss Federal Institute of Technology (ETHZ)
Zurich, Switzerland
Email: {mawenk, apburg, studer, fw}@iis.ee.ethz.ch

Abstract- Sphere decoding (SD) is widely considered one of the most promising detection schemes for multiple-input multiple-output (MIMO) communication systems. The recently proposed list sphere-decoding (LSD) algorithm is an extension of the original SD algorithm that considerably improves the error rate performance of wireless communication systems by providing soft outputs instead of binary decisions. This paper addresses the VLSI implementation of the LSD algorithm. To this end, algorithm optimizations suitable for efficient hardware implementation are developed. The implemented circuits achieve a gain of up to 3 dB in SNR compared to hard-output SDs and a throughput of up to 272 Mbps at 20 dB SNR in a 0.25 µm technology for 4x4 MIMO systems with 16-QAM modulation.

I. INTRODUCTION

The evolution of wireless communication systems is driven by the demand for higher system capacity, higher peak throughput, and better quality of service. Multiple-input multiple-output (MIMO) systems [1] can meet these demands by employing multiple antennas on both sides of the wireless link to transmit multiple data streams concurrently in the same frequency band (spatial multiplexing). Hence, many upcoming standards such as IEEE 802.11n and IEEE 802.16e have been designed to take advantage of MIMO technology. Unfortunately, spatial multiplexing also requires significantly more complex signal processing than single-input single-output (SISO) systems. A considerable share of the additional complexity lies in the MIMO detector, which separates the interfering data streams at the receiver.
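As a rough illustration of why the detector carries so much of this complexity, the following Python sketch counts the bits conveyed per channel use under spatial multiplexing and the number of candidate vectors an exhaustive search would have to compare. The function names are hypothetical and chosen for this illustration only. Exhaustive search serves merely as a reference point here and is not the detection method studied in this paper.

```python
def bits_per_channel_use(num_streams: int, constellation_size: int) -> int:
    """Bits carried by one transmitted vector with spatial multiplexing.

    Assumes every stream uses the same constellation and that its size
    is a power of two (e.g., 4 for QPSK, 16 for 16-QAM).
    """
    bits_per_symbol = constellation_size.bit_length() - 1
    if constellation_size < 2 or 2 ** bits_per_symbol != constellation_size:
        raise ValueError("constellation size must be a power of two >= 2")
    return num_streams * bits_per_symbol


def exhaustive_candidates(num_streams: int, constellation_size: int) -> int:
    """Number of distinct transmit vectors an exhaustive search compares."""
    return constellation_size ** num_streams


if __name__ == "__main__":
    # Four spatial streams, as in a 4x4 MIMO system.
    for name, size in [("QPSK", 4), ("16-QAM", 16)]:
        bits = bits_per_channel_use(4, size)
        count = exhaustive_candidates(4, size)
        print(f"4 streams, {name}: {bits} bits/channel use, {count} candidates")
```

For four streams with 16-QAM this gives 16 bits per channel use and 65,536 candidate vectors, compared with 8 bits and 256 candidates for QPSK. The candidate count grows exponentially with the number of bits per channel use.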
Hard-decision MIMO detectors deliver binary estimates of the transmitted data, while soft-decision detectors provide log-likelihood ratios (LLRs) derived from the a-posteriori probabilities (APPs) of the transmitted bits. This additional information, which is obtained at the cost of even higher computational complexity than hard-decision detection, can be used by the subsequent channel decoder to considerably improve the error rate performance of the communication system.

For hard-decision MIMO detection, the SD algorithm [2], [3] provides optimum vector error rate performance and can be implemented efficiently in VLSI [4], [5] to achieve very high throughput at high spectral efficiency. The problem of implementing the more complex soft-output MIMO detection algorithms has been addressed by only a few publications. For example, [6] describes a low-complexity linear MIMO detector with soft-decision output which achieves high throughput but suffers from a bit error rate (BER) performance degradation. At the other extreme, the design described in [7] is an ML-APP detector which provides the best possible error rate performance but is limited to a spectral efficiency of up to 8 bits per channel use (e.g., 4x4 with QPSK modulation). A promising scheme that efficiently mitigates the exponential increase in complexity of the ML-APP algorithm while still providing close-to-ML-APP BER performance is the list sphere decoder (LSD) introduced in [8]. An architectural concept for the VLSI implementation of this scheme has been proposed in [9]. However, that publication does not provide details of the hardware architecture, and the predicted throughput of the presented design is below the requirements of some relevant wideband communication systems, such as IEEE 802.11n.

Contribution: This paper describes two novel VLSI architectures for high-throughput list sphere decoding.
The first approach is an implementation of the original LSD algorithm [8] based on the hard-decision SD architecture presented in [5]. The second approach introduces some implementation-driven changes to the algorithm which result in a better performance (both in terms of BER and throughput) at the cost of a slightly larger silicon area. Figures of merit are provided for both approaches.

Outline: The remainder of this section introduces the system model and summarizes the hard-decision SD and the ML-APP algorithm which constitute the basis for the introduction of the LSD algorithm provided in Section II. Section III describes the first hardware architecture for the LSD scheme. The improved LSD algorithm and the corresponding VLSI architecture are presented in Section IV. Section V summarizes the implementation results.

A. System Model

Consider a MIMO system with $M_T$ transmit and $M_R$ receive antennas. The equivalent baseband model of the MIMO channel between transmitter and receiver is described by an $M_R \times M_T$-dimensional complex-valued matrix $\mathbf{H}$. The input-output relation of the MIMO system is given by

$$\mathbf{y} = \mathbf{H}\mathbf{s} + \mathbf{n}, \qquad (1)$$

where $\mathbf{y}$ is the $M_R$-dimensional received vector, $\mathbf{s}$ is the $M_T$-dimensional transmitted signal vector and $\mathbf{n}$ denotes the $M_R$-dimensional i.i.d. complex Gaussian noise vector with variance $N_0$ per complex-valued dimension. For spatial multiplexing, the entries of $\mathbf{s}$ are chosen independently from a set $\mathcal{O}$ of

1-4244-0772-9/06/$20.00 ©2006 IEEE