Prediction of User Opinion for Products
A Bag-of-Words and Collaborative Filtering based Approach

Esteban García-Cuesta¹, Daniel Gómez-Vergel¹, Luis Gracia-Expósito¹ and María Vela-Pérez²

¹ Computer Science Department, Universidad Europea de Madrid, Calle Tajo S/N, Villaviciosa de Odón 28670, Madrid, Spain
² Departamento de Estadística e Investigación Operativa II, Facultad de Ciencias Económicas y Empresariales, Universidad Complutense de Madrid, Campus de Somosaguas, Madrid, Spain
{esteban.garcia, daniel.gomez, luis.gracia}@universidadeuropea.es, mvelaper@ucm.es

Keywords: User Opinion, Recommendation Systems, User Modeling, Prediction, Hyper-personalization.

Abstract: The rapid proliferation of social network services (SNS) gives people the opportunity to express their thoughts, opinions, and tastes on a wide variety of subjects such as movies or commercial items. Most item shopping websites currently provide SNS systems to collect users' opinions, including rating and text reviews. In this context, user modeling and hyper-personalization of contents reduce information overload and improve both the efficiency of the marketing process and the user's overall satisfaction. As is well known, users' behavior is usually subject to sparsity and their preferences remain hidden in a latent subspace. A majority of recommendation systems focus on ranking the items by describing this subspace appropriately but neglect to properly justify why they should be recommended based on the user's opinion. In this paper, we intend to extract the intrinsic opinion subspace from users' text reviews, by means of collaborative filtering techniques, in order to capture their tastes and predict their future opinions on items not yet reviewed. We will show how users' reviews can be predicted by using a set of words related to their opinions.
1 INTRODUCTION

The advent of the Internet and its social websites has made it possible for people to express their opinions with great ease. This is particularly true of e-commerce websites (e.g., Amazon), where users may read published opinions to gather a first impression of an item before purchasing it. This information may also be used to design better marketing strategies, to hyper-personalize the website, and to improve the user's experience. By hyper-personalization we mean not only the process of adapting to the users' needs and characteristics but also providing some insight into that adaptation.

In this sense, recommender systems have truly transformed the way users interact with and discover products on the web. Whenever a user assesses a product, there is a need to model how the assessment is made in order to recommend new products the user may be interested in (McAuley and Leskovec, 2013a) or to identify users with similar tastes (Sharma and Cosley, 2013). To model users and the way they evaluate and review products, it becomes necessary to unveil the latent structure of their opinions. In (McAuley and Leskovec, 2013b), for instance, the authors present a hidden factor model to understand why any two users may agree when reviewing one movie yet disagree when reviewing another: the fact that users may have similar preferences towards one genre but opposite preferences for another turns out to be of primary importance in this context.

Incorporating the latent factors associated with users is, therefore, a fundamental step in any recommendation system (Bennet and Lanning, 2007). Typically, these systems use plain-text reviews and/or numerical scores, along with machine learning algorithms, to predict the scores that users will give to items they have not yet reviewed (Y. Koren and Volinsky, 2009).
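As a rough illustration of the score prediction described above, the following minimal sketch factorizes a small, made-up rating set into user and item latent vectors using stochastic gradient descent. It is a generic latent factor model and not the method of any of the cited works; the function names (`factorize`, `predict`, `rmse`), the hyperparameters, and the toy ratings are all assumptions introduced here for illustration.

```python
import math
import random

# Hypothetical toy data: (user, item, score) triples on a 1-5 scale.
# Users 0-1 favour items 0-1; users 2-3 favour items 2-3.
RATINGS = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 1),
    (2, 0, 1), (2, 2, 5), (2, 3, 4),
    (3, 1, 1), (3, 2, 4), (3, 3, 5),
]


def predict(P, Q, u, i):
    """Predicted score: dot product of user u's and item i's latent vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def factorize(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02,
              epochs=1000, seed=0):
    """Fit r_ui ~ p_u . q_i on the observed ratings only, via SGD."""
    rng = random.Random(seed)
    # Small random initial latent vectors for every user and item.
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(P, Q, ratings):
    """Root-mean-square error of the model on a list of rating triples."""
    se = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(se / len(ratings))


if __name__ == "__main__":
    P, Q = factorize(RATINGS, n_users=4, n_items=4)
    print("training RMSE:", rmse(P, Q, RATINGS))
    # User 0 never rated item 3; the model still yields a score for the pair.
    print("predicted score for (user 0, item 3):", predict(P, Q, 0, 3))
```

Only the observed triples enter the loss, which is how such models cope with sparse data: unreviewed user-item pairs are simply scored from the learned vectors.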
García-Cuesta, E., Gómez-Vergel, D., Expósito, L. and Vela-Pérez, M. Prediction of User Opinion for Products - A Bag-of-Words and Collaborative Filtering based Approach. DOI: 10.5220/0006209602330238. In Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods (ICPRAM 2017), pages 233-238. ISBN: 978-989-758-222-6. Copyright © 2017 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.

In (McAuley and Leskovec, 2013b), the authors also propose using these latent factors not for prediction but to achieve a better understanding of the rating dimensions (to be connected to the intrinsic features of users and their likes), hence improving the user modeling process. Our starting hypothesis is that, by assuming the existence of a latent space that accurately represents the users' interests and tastes (see (McAuley and Leskovec, 2013b)), we may be able to predict their opinions/reviews. Rather than using the latent space to predict ratings, we therefore intend to predict the