2162-2337 (c) 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/LWC.2017.2761876, IEEE Wireless Communications Letters

Fast Converging Weighted Neumann Series Precoding for Massive MIMO Systems

Betty Nagy, Maha Elsabrouty and Salwa Elramly

Abstract—Neumann Series (NS) expansion-based precoder in massive multiple input multiple output (MIMO) systems suffers from slow convergence. To solve this problem, this letter proposes a weighted NS expansion precoder. The weights are designed to minimize the error between the exact inverse and the weighted NS inverse. The optimal weights are deduced analytically. Moreover, an approximation of these optimal weights is proposed, based on the properties of large Wishart matrices, which saves the recomputation of these weights. The weighted NS precoding provides near optimal performance at only four weighted expanded NS terms and has lower complexity than recently proposed approximate precoders.

Index Terms—Massive MIMO, Matrix inversion, Neumann series, Optimization.

I. INTRODUCTION

MASSIVE MIMO is one of the most promising technologies for the 5th generation (5G) wireless communications systems [1] due to its exceptional array gain. Marzetta in [2] showed that as the number of the antenna elements grows large, the effects of uncorrelated noise, fast fading and intra-cell interference decrease. Thus, linear precoding schemes can achieve near optimal performance [3]. Linear precoders like Zero Forcing (ZF) [3], Regularized Zero Forcing (RZF) [4] and Minimum Mean Square Error (MMSE) [3] require the inversion of the channel Gram matrix of all users.
B. Nagy is with the Department of Physics and Applied Mathematics and S. Elramly is with the Department of Electronics and Communications Engineering, Faculty of Engineering, Ain Shams University, Cairo 11571, Egypt (e-mail: {betty.nagy,salwa_elramly}@eng.asu.edu.eg). M. Elsabrouty is with the Department of Electronics and Communications Engineering, Egypt-Japan University of Science and Technology (E-JUST), Borg El-Arab 21934, Alexandria, Egypt (e-mail: maha.elsabrouty@ejust.edu.eg).

With the large Gram matrix that arises when massive MIMO serves a large number of users, matrix inversion is an important practical problem that affects precoder design and performance. A good precoder requires a matrix inversion approximation that combines low complexity with good accuracy. There are several approaches to obtaining the inverse of a large matrix. The first approach, the direct method, decomposes the matrix to be inverted into a product of simple matrices, as in Cholesky factorization [5] and QR decomposition. This approach suffers from high complexity and requires special arithmetic units [6]. The second approach, indirect or iterative methods, treats the inversion problem as a system of linear equations and solves it iteratively. Examples of the second approach are the Gauss-Seidel [7], conjugate gradient [8] and symmetric successive over-relaxation (SSOR) [9] methods. Although the iterative approach gets a more accurate approximation than the direct approach [5], it suffers from larger delays. The third approach expands the inverse of a matrix into a series of matrix-vector multiplications, as in Neumann Series (NS) expansion [6] and truncated polynomial expansion (TPE) [4]. The main advantage of the NS expansion is its simple implementation. However, it suffers from slow convergence. This paper aims at speeding up the convergence of the NS expansion-based precoder.
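To illustrate the convergence behaviour of the unweighted NS expansion, the following is a minimal sketch, not the implementation used in this letter. It assumes the common diagonal-based form, in which the inverse of the Gram matrix W is approximated by the truncated sum of (I - D^{-1} W)^n D^{-1} with D = diag(W). The real-valued Gaussian channel, the antenna and user counts, and all function names are illustrative assumptions; the residual ||I - W X||_F is used as the error measure so that no exact inverse is needed.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def eye_minus(a):
    """Return I - a for a square matrix a."""
    k = len(a)
    return [[(1.0 if i == j else 0.0) - a[i][j] for j in range(k)] for i in range(k)]


def neumann_inverse(w, terms):
    """Approximate inv(w) by sum_{n < terms} (I - D^-1 w)^n D^-1, with D = diag(w)."""
    k = len(w)
    d_inv = [[1.0 / w[i][i] if i == j else 0.0 for j in range(k)] for i in range(k)]
    theta = eye_minus(matmul(d_inv, w))
    power = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]  # theta^0
    acc = [[0.0] * k for _ in range(k)]
    for _ in range(terms):
        acc = [[acc[i][j] + power[i][j] for j in range(k)] for i in range(k)]
        power = matmul(power, theta)
    return matmul(acc, d_inv)


def residual(w, w_inv_approx):
    """Frobenius norm of I - w @ w_inv_approx (zero for an exact inverse)."""
    diff = eye_minus(matmul(w, w_inv_approx))
    return sum(x * x for row in diff for x in row) ** 0.5


def demo(num_antennas=64, num_users=4, max_terms=6, seed=0):
    """Residuals for 1..max_terms NS terms on a random Gram matrix (assumed setup)."""
    rng = random.Random(seed)
    # Assumption: real-valued i.i.d. Gaussian channel, one row per user.
    h = [[rng.gauss(0.0, 1.0) for _ in range(num_antennas)] for _ in range(num_users)]
    h_t = [list(col) for col in zip(*h)]
    w = matmul(h, h_t)  # num_users x num_users Gram matrix
    return [residual(w, neumann_inverse(w, t)) for t in range(1, max_terms + 1)]


errors = demo()
for terms, err in enumerate(errors, start=1):
    print(f"{terms} term(s): residual = {err:.3e}")
```

Under these assumptions the residual shrinks as terms are added, and lowering the ratio of antennas to users in `demo` makes the decay slower, which is the slow-convergence regime discussed above.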
Firstly, weights are introduced to the terms of the NS expansion. Secondly, an optimization problem whose objective is to minimize the error between the exact matrix inverse and the weighted NS inverse is presented and solved analytically. Finally, an approximation to the optimal weights is proposed based on the properties of large Wishart matrices. Consequently, the computation of the optimal weights is insensitive to instantaneous channel realizations. Hence, it can be done once during the system setup, and the system complexity will be equivalent to that of the conventional NS-based precoding. Simulation results show that the weighted NS precoder with four expanded series terms performs very close to optimal exact inversion. Moreover, the complexity analysis provides a practical condition under which the weighted NS has lower complexity than competing techniques. Inspection of practical values in 5G systems shows that this condition is satisfied most of the time, except in the case of a large number of users with high mobility.

It must be noted that weighting and optimizing the coefficients of series expansion terms was previously applied in [4] and [10]. However, [4] targeted increasing the system throughput, and the added complexity in the system could not be reduced to that of the original TPE. In contrast, in the approach proposed here the weights target the problem of slow convergence in NS-based precoding. The approximate weights developed using the properties of large Wishart matrices not only provide faster convergence than the unweighted case but also come at no extra computational complexity. The work in [10] aimed to simplify the block diagonalization precoding for multiple antenna users. The authors of [10] used another definition of the NS expansion and obtained an approximate expression for the optimal weights. On the other hand, this letter proposes a simplified ZF precoder for single antenna users.
According to [1], the diagonal-based NS expansion used here is more robust against unknown channel distribution. Moreover, both exact and approximate optimal weights are presented.

Notations: Lower-case and upper-case boldface letters denote vectors and matrices. $(\cdot)^T$, $(\cdot)^H$, $(\cdot)^\dagger$, $\mathrm{tr}(\cdot)$, $(\cdot)^{-1}$ and $\|\cdot\|_F$ denote the transpose, conjugate transpose, pseudoinverse, trace, inversion and Frobenius norm respectively, $\binom{n}{k}$ denotes