Proximal Alternating-Direction-Method-of-Multipliers-Incorporated Nonnegative Latent Factor Analysis

Fanghui Bi, Xin Luo, Senior Member, IEEE, Bo Shen, Senior Member, IEEE, Hongli Dong, Senior Member, IEEE, and Zidong Wang, Fellow, IEEE

Abstract—High-dimensional and incomplete (HDI) data subject to nonnegativity constraints are commonly encountered in big data-related applications concerning the interactions among numerous nodes. A nonnegative latent factor analysis (NLFA) model can perform representation learning on HDI data efficiently. However, existing NLFA models suffer from either a slow convergence rate or representation accuracy loss. To address this issue, this paper proposes a proximal alternating-direction-method-of-multipliers-based nonnegative latent factor analysis (PAN) model with two-fold ideas: 1) adopting the principle of the alternating direction method of multipliers to implement an efficient learning scheme for fast convergence and high computational efficiency; and 2) incorporating proximal regularization into the learning scheme to suppress the optimization fluctuation for high representation learning accuracy on HDI data. Theoretical studies verify that PAN converges to a Karush-Kuhn-Tucker (KKT) stationary point of its nonnegativity-constrained learning objective with its learning scheme. Experimental results on eight HDI matrices from real applications demonstrate that the proposed PAN model outperforms several state-of-the-art models in both estimation accuracy for missing data of an HDI matrix and computational efficiency.

Index Terms—Data science, high-dimensional and incomplete data, knowledge acquisition, industrial application, nonnegative latent factor analysis (NLFA), proximal alternating direction method of multipliers, representation learning.

I. Introduction

Interaction data among numerous nodes are frequently seen in many big data-related applications like a recommender system [1]−[3], a blockchain network [4]−[5], and a social network [6]. They are always highly incomplete due to the impossibility of achieving the full interaction mapping among an explosively increasing number of nodes. Consequently, a high-dimensional and incomplete (HDI) matrix is commonly adopted to model such interaction data [1]−[5]. Despite its high incompleteness, an HDI matrix contains a huge amount of valuable knowledge in various forms, e.g., potential communities [4]−[6] and user preferences [1]−[3], [7]−[9]. An efficient representation learning model for an HDI matrix is desired for acquiring such knowledge.

Great efforts have been made on this issue, resulting in various sophisticated representation learning models for HDI data [10]−[15]. Among them, a nonnegative latent factor analysis (NLFA) model [3], [14] is becoming increasingly popular because of: 1) the inherent nonnegativity of most industrial data [3], [14]−[16]; and 2) its high efficiency in performing representation learning on HDI data. Compared with conventional nonnegative matrix factorization (NMF)-based models [17]−[21], an NLFA model benefits from its data density-oriented modeling mechanism (i.e., defining its generalized loss and regularization term on the unbalanced known data of a target HDI matrix rather than on the matrix itself), thereby achieving high computational and storage efficiency. For instance, the Yahoo music matrix [22] contains 76 344 627 known instances by 200 000 users on 136 736 songs: its full size comes to 27.35 billion entries, so the known data density is only 0.28%. When performing representation learning on such an HDI matrix, an NLFA model has proven to be much more efficient than an NMF model [3], [14], [17], [18].

Existing NLFA models mostly rely on a single latent factor-dependent, nonnegative and multiplicative update (SLF-NMU) algorithm.
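As an illustration of the two ideas just mentioned (the data density-oriented loss defined on known entries only, and a multiplicative update that preserves nonnegativity), the following NumPy sketch implements a generic masked multiplicative update of the NMU family on a toy HDI matrix. This is not the paper's SLF-NMU or PAN algorithm; all names (`R`, `U`, `V`, `rank`) are illustrative, and the density check simply reproduces the Yahoo music arithmetic quoted above.

```python
import numpy as np

# Density check for the Yahoo music example cited in the text:
# 76 344 627 known instances in a 200 000 x 136 736 matrix -> about 0.28%.
density = 76_344_627 / (200_000 * 136_736)

# Toy HDI matrix: unknown entries are np.nan; known entries are nonnegative.
R = np.array([
    [5.0, np.nan, 3.0, np.nan],
    [np.nan, 4.0, np.nan, 1.0],
    [2.0, np.nan, np.nan, np.nan],
])
mask = ~np.isnan(R)      # indicator of known entries
R0 = np.nan_to_num(R)    # unknown entries replaced by 0; the mask excludes them

rank = 2
rng = np.random.default_rng(0)
U = rng.random((R.shape[0], rank))   # nonnegative initialization
V = rng.random((R.shape[1], rank))

for _ in range(300):
    # Multiplicative update restricted to known entries: each factor is
    # rescaled by a ratio of nonnegative terms, so it stays nonnegative
    # as long as it starts nonnegative (no explicit projection needed).
    UV = (U @ V.T) * mask
    U *= (R0 @ V) / np.maximum(UV @ V, 1e-12)
    UV = (U @ V.T) * mask
    V *= (R0.T @ U) / np.maximum(UV.T @ U, 1e-12)

# Error is measured on known entries only, mirroring the density-oriented loss.
rmse = np.sqrt(np.mean((R0 - U @ V.T)[mask] ** 2))
```

Note that the loss, the updates, and the error metric all touch only the known entries, which is why such schemes scale with the number of observed instances rather than with the full matrix size.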
Similar to a nonnegative multiplicative update (NMU) algorithm [17]−[20] designed for an NMF model, an SLF-NMU algorithm works by making the learning rate in an additive gradient descent algorithm adaptive to cancel the negative terms in the parameter update rules, thereby ensuring the nonnegativity of each desired latent factor if it is initially nonnegative [3], [14]−[17]. It greatly

Manuscript received December 21, 2022; revised January 15, 2023; accepted February 2, 2023. This work was supported by the National Natural Science Foundation of China (62272078, U21A2019), the Hainan Province Science and Technology Special Fund of China (ZDYF2022SHFZ105), and the CAAI-Huawei MindSpore Open Fund (CAAIXSJLJJ-2021-035A). Recommended by Associate Editor Shangce Gao. (Corresponding author: Xin Luo.)

Citation: F. H. Bi, X. Luo, B. Shen, H. L. Dong, and Z. D. Wang, “Proximal alternating-direction-method-of-multipliers-incorporated nonnegative latent factor analysis,” IEEE/CAA J. Autom. Sinica, vol. 10, no. 6, pp. 1388–1406, Jun. 2023.

F. H. Bi is with the Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, and also with the Chongqing School, University of Chinese Academy of Sciences, Chongqing 400714, China (e-mail: bifanghui@cigit.ac.cn).

X. Luo is with the College of Computer and Information Science, Southwest University, Chongqing 400715, China (e-mail: luoxin@swu.edu.cn).

B. Shen is with the College of Information Science and Technology, Donghua University, Shanghai 201620, China (e-mail: bo.shen@dhu.edu.cn).

H. L. Dong is with the Artificial Intelligence Energy Research Institute, Northeast Petroleum University, Daqing 163318, China (e-mail: hongli.dong@nepu.edu.cn).

Z. D. Wang is with the Department of Computer Science, Brunel University London, Uxbridge UB8 3PH, United Kingdom (e-mail: zidong.wang@brunel.ac.uk).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JAS.2023.123474