What about Interpreting Features in Matrix Factorization-based Recommender Systems as Users?

Marharyta Aleksandrova
Université de Lorraine - LORIA, France
NTUU “KPI”, Ukraine
firstname.lastname@loria.fr

Armelle Brun
Université de Lorraine - LORIA
Campus Scientifique, 54506 Vandoeuvre les Nancy, France
firstname.lastname@loria.fr

Anne Boyer
Université de Lorraine - LORIA
Campus Scientifique, 54506 Vandoeuvre les Nancy, France
firstname.lastname@loria.fr

Oleg Chertov
NTUU “KPI”, 37, Prospect Peremohy, 03056, Kyiv, Ukraine
chertov@i.ua

ABSTRACT
Matrix factorization (MF) is a powerful approach used in recommender systems. One main drawback of MF is the difficulty of interpreting the automatically formed features. Following the intuition that the relation between users and items can be expressed through a reduced set of users, referred to as representative users, we propose a simple modification of a traditional MF algorithm that forms a set of features corresponding to these representative users. On one state-of-the-art dataset, we show that the proposed representative users-based non-negative matrix factorization (RU-NMF) discovers interpretable features while slightly (in some cases insignificantly) decreasing the accuracy.

Keywords
Recommender systems, matrix factorization, feature interpretation.

1. INTRODUCTION, RELATED WORKS
Recommender systems aim to estimate the ratings of target users on previously unseen items. One of the methods used for this task is matrix factorization (MF), which relies on the idea that a small number of latent factors (features) underlie the interactions between users and items [1]. Let M be the number of users and N the number of items. The interaction between these entities is usually represented as a matrix R whose element r_{mn} is the rating assigned by user m to item n.
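To make this notation concrete, the following minimal Python sketch stores a few ratings r_{mn} in a matrix R and estimates the unseen ratings as inner products of K-dimensional latent factor vectors. The toy ratings, the hand-picked factor values, and the names build_rating_matrix, predict, user_factors and item_factors are illustrative assumptions and are not taken from the paper; in practice the factors would be learned from R rather than chosen by hand.

```python
def build_rating_matrix(triples, num_users, num_items):
    """Return an M x N matrix R where R[m][n] is the rating of user m
    for item n, or None when the rating is unobserved."""
    R = [[None] * num_items for _ in range(num_users)]
    for m, n, rating in triples:
        R[m][n] = rating
    return R


def predict(user_factors, item_factors, m, n):
    """Estimate r_mn as the inner product of the K-dimensional
    factor vectors of user m and item n."""
    return sum(u * v for u, v in zip(user_factors[m], item_factors[n]))


# Hypothetical toy data: M = 3 users, N = 4 items, K = 2 latent factors.
M, N, K = 3, 4, 2
observed = [(0, 0, 5.0), (0, 2, 3.0), (1, 1, 4.0), (2, 0, 1.0), (2, 3, 2.0)]
R = build_rating_matrix(observed, M, N)

# Hand-picked (not learned) factor vectors, one row per user / per item.
user_factors = [[2.0, 1.0], [0.5, 1.5], [0.0, 1.0]]
item_factors = [[2.0, 1.0], [1.0, 2.0], [1.0, 1.0], [0.0, 2.0]]

# Positions of R without an observed rating are the ones to estimate.
unseen = [(m, n) for m in range(M) for n in range(N) if R[m][n] is None]
for m, n in unseen:
    estimate = predict(user_factors, item_factors, m, n)
    print(f"user {m}, item {n}: estimated rating {estimate:.2f}")
```

Using None for unobserved entries keeps a missing rating distinct from a rating of zero, which matters because only the observed entries carry information about user preferences.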
MF techniques decompose the original rating matrix R into two low-rank matrices U (dim(U) = K × M) and V (dim(V) = K × N) in such a way that the product of these matrices approximates the original rating matrix: R ≈ R* = U^T V. The set of K factors can be seen as a joint latent space onto which both the user space and the item space are mapped [1]. Features resulting from factorization usually have no physical meaning, which makes the resulting recommendations unexplainable. Some works [2, 3] have attempted to interpret them by using non-negative matrix factorization with multiplicative update rules (for simplicity, referred to below as NMF). However, the proposed interpretation is not easy to perform, as it has to be discovered manually. Based on the assumption that the preferences of users are correlated, we assume that within the entire set of users there is a small set of users that have a specific role or specific preferences. These users can be considered representative of the entire population, and we intend to discover features from MF that are associated with these representative users.

2. THE PROPOSED APPROACH: RU-NMF
Let us consider two linear spaces L_1 and L_2 of dimensionality 6 and 3 respectively, with basis vectors in canonical form {u_m}, m = 1, …, 6 and {f_k}, k = 1, …, 3. Let the transfer matrix from L_1 to L_2 be specified by matrix (1). Then u_5, u_1 and u_2 are direct preimages of f_1, f_2 and f_3 respectively; indeed, P u_5 = f_1, since the fifth column of P is (1, 0, 0)^T. At the same time, vectors u_3, u_4 and u_6 are mapped into linear combinations of the basis vectors f_1, f_2, f_3.

\[
P = \begin{pmatrix}
0 & 0 & p_{13} & p_{14} & 1 & p_{16} \\
1 & 0 & p_{23} & p_{24} & 0 & p_{26} \\
0 & 1 & p_{33} & p_{34} & 0 & p_{36}
\end{pmatrix}
\tag{1}
\]

Matrix U can be considered a transfer matrix from the space of users to the space of features. Analyzing the example considered above, we can say that if matrix U has a form similar to (1), i.e. U has exactly K unitary columns