Collaborative Filtering through SVD-Based and Hierarchical Nonlinear PCA

Manolis G. Vozalis, Angelos Markos, and Konstantinos G. Margaritis

Department of Applied Informatics, University of Macedonia, 156 Egnatia Street, P.O. Box 1591, 54006, Thessaloniki, Greece
{mans,amarkos,kmarg}@uom.gr

Abstract. In this paper, we describe and compare two distinct algorithms aiming at the low-rank approximation of a user-item ratings matrix in the context of Collaborative Filtering (CF). The first one implements standard Principal Component Analysis (PCA) of an association matrix formed from the original data. The second algorithm is based on h-NLPCA, a nonlinear generalization of standard PCA, which utilizes an autoassociative network and constrains the nonlinear components to have the same hierarchical order as the linear components in standard PCA. We examine the impact of these two approaches on the quality of the generated predictions through a series of experiments. Experimental results show that the latter approach outperforms the standard PCA approach for most values of the retained dimensions.

Keywords: Collaborative Filtering, Low-rank Approximation, Artificial Neural Networks, Principal Component Analysis.

1 Introduction

The term Collaborative Filtering (CF) refers to intelligent techniques that are employed by Recommender Systems (RSs) to generate personalized recommendations. The basic idea of CF is that users who have agreed in the past tend to agree in the future. A common and successful approach to collaborative prediction is to fit a factor model to the original rating data and use it to make further predictions. A factor model approximates the observed user preferences in a low-dimensional space in order to uncover latent features that explain those preferences. In this paper, we focus on two PCA implementations, both aiming at the low-rank approximation of the corresponding user-item ratings matrix.
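To make the notion of a low-rank approximation of a user-item ratings matrix concrete, the following is a minimal sketch using a truncated SVD in NumPy. It is an illustration only and not the procedure evaluated in this paper: the function name `low_rank_approx`, the toy `ratings` matrix, the use of `np.nan` for unobserved ratings, and the item-mean imputation are all assumptions introduced here.

```python
import numpy as np


def low_rank_approx(ratings, k):
    """Rank-k approximation of a user-item matrix via truncated SVD.

    Assumption of this sketch: unobserved ratings are marked with np.nan.
    Returns the approximation and the imputed (filled) matrix.
    """
    R = np.asarray(ratings, dtype=float)
    # Assumption: impute each missing rating with its item (column) mean.
    item_means = np.nanmean(R, axis=0)
    filled = np.where(np.isnan(R), item_means, R)
    # Center the columns, so that the SVD of the centered matrix
    # yields the principal directions of the item space.
    centered = filled - item_means
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    # Keep the k leading singular triplets and add the item means back.
    approx = (U[:, :k] * s[:k]) @ Vt[:k, :] + item_means
    return approx, filled


# Hypothetical toy data: 5 users (rows) by 4 items (columns).
ratings = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [4.0, np.nan, 1.0, 1.0],
    [1.0, 1.0, 5.0, np.nan],
    [np.nan, 2.0, 4.0, 5.0],
    [5.0, 5.0, 2.0, 1.0],
])

for k in range(1, 5):
    approx, filled = low_rank_approx(ratings, k)
    # Frobenius-norm reconstruction error with respect to the imputed matrix.
    print(k, np.linalg.norm(approx - filled))
```

The reconstruction error does not increase as more dimensions are retained; the entries of the approximation at the originally unobserved positions can be read as predicted ratings. How the missing entries are imputed strongly affects such predictions, so the item-mean choice above should be treated as a placeholder.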
K. Diamantaras, W. Duch, L.S. Iliadis (Eds.): ICANN 2010, Part I, LNCS 6352, pp. 395–400, 2010. © Springer-Verlag Berlin Heidelberg 2010

PCA is a well-established data analysis technique that relies on a simple transformation of recorded observations to produce uncorrelated score variables. It has been extensively used for lossy data compression, feature extraction, data visualization, and most recently in the field of Collaborative Filtering [1,2,3]. The linear assumption underlying PCA makes it insufficient for capturing nonlinear patterns among variables. Artificial Neural Network (ANN) models, a class of nonlinear empirical modeling methods, allow for nonlinear