Performance Comparison of Similarity Measures Used in Recommendation Systems

Berna Seref 1* and Erkan Bostanci 2

1 Ankara University, Computer Engineering Department, Ankara, Turkey. Email: bseref@ankara.edu.tr
2 Ankara University, Computer Engineering Department, Ankara, Turkey. Email: ebostanci@ankara.edu.tr
*Corresponding Author

Abstract: The amount of data on the web keeps growing, which makes it hard to find relevant data and make good decisions. Recommendation systems provide suggestions to users about various items. They are classified into four groups: collaborative filtering, content-based filtering, knowledge-based recommender systems, and hybrid recommendation systems. Similarity measures such as Pearson Correlation, Euclidean, Uncentred Cosine, and LogLikelihood are used to calculate the similarity between users or items. In this study, a user-based collaborative filtering recommendation system is developed on the Eclipse platform using the Mahout library, with Pearson Correlation, Euclidean, Uncentred Cosine, and LogLikelihood as the similarity measures, and their recommendation performances are compared. MovieLens datasets are used to train and test the system. The results show that the Uncentred Cosine similarity measure achieves the best mean absolute error and root mean square error, while the Pearson Correlation measure achieves the best precision, recall, and F-measure.

Keywords: Euclidean, Loglikelihood, Pearson, Recommendation, Similarity.

I. Introduction

Recommender systems recommend items or data using background data, such as users' ratings of items and item features, in order to cope with the information overload problem. This problem occurs when the amount of data exceeds the processing capability of the system [1]. As a result, it is hard to make efficient decisions [2].
People therefore need intelligent techniques to filter data and retrieve what is relevant. Recommender systems are applications that present intelligent suggestions on items and cope with the information overload problem [1, 3, 4]. They are classified into four groups according to the techniques and information they use to make recommendations: collaborative filtering, content-based filtering, knowledge-based recommender systems, and hybrid recommendation systems [1, 5-7].

Collaborative filtering systems make recommendations using only users’ ratings [8]. They assume that users with similar interests and opinions tend to prefer similar items [1]. These systems are classified into two groups: memory-based and model-based algorithms. In a memory-based algorithm, similarities between users or items are calculated directly from the ratings. In a model-based algorithm, a predictive model is first constructed from the user database and is then used to make predictions [9].

Content-based filtering systems recommend items whose features are similar to those of items the user preferred in the past [1]. Knowledge-based recommender systems make recommendations based on knowledge of how certain item features and attributes satisfy users’ needs [1, 4]. Hybrid recommendation systems make recommendations using two or more techniques [1].

According to Gunawardana and Shani [10], recommendation systems have two tasks: the prediction task and the recommendation task. In the prediction task, user opinions such as ratings are predicted. In the recommendation task, relevant items are recommended to the users.

Computer technology is advancing day by day. As a result, internet use for reading news or articles and e-commerce rates have both been increasing. People want to get relevant data as soon as possible.
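
As an illustration of the memory-based approach and the two tasks described above, the following minimal Python sketch computes the Pearson correlation between users directly from their ratings, predicts a missing rating (prediction task), and ranks unseen items (recommendation task). The toy ratings, the function names pearson, predict, and recommend, and the similarity-weighted average over positively correlated neighbours are assumptions made for this illustration; they are not the Mahout implementation used in this study.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {item: rating}); not taken from MovieLens.
RATINGS = {
    "alice": {"A": 5.0, "B": 3.0, "C": 4.0},
    "bob":   {"A": 4.0, "B": 2.0, "C": 3.0, "D": 5.0},
    "carol": {"A": 1.0, "B": 5.0, "C": 2.0, "D": 1.0},
}


def pearson(u, v, ratings=RATINGS):
    """Pearson correlation over the items that both users have rated."""
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if len(common) < 2:
        return 0.0  # too little overlap to measure correlation
    xs = [ratings[u][i] for i in common]
    ys = [ratings[v][i] for i in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs)) * sqrt(sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0


def predict(user, item, ratings=RATINGS):
    """Prediction task: similarity-weighted mean of neighbours' ratings.

    Assumption: only positively correlated neighbours contribute.
    Returns None when no such neighbour has rated the item.
    """
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        w = pearson(user, other, ratings)
        if w > 0:
            num += w * their[item]
            den += w
    return num / den if den else None


def recommend(user, n=1, ratings=RATINGS):
    """Recommendation task: rank items the user has not rated by predicted rating."""
    candidates = {i for r in ratings.values() for i in r} - set(ratings[user])
    scored = [(i, predict(user, i, ratings)) for i in sorted(candidates)]
    scored = [(i, s) for i, s in scored if s is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]
```

With these toy ratings, recommend("alice") ranks item D first, because bob is the only user whose ratings correlate positively with those of alice and he rated D highly. Discarding negatively correlated neighbours is a simplification chosen for brevity; other weighting schemes are possible.
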
Recommendation systems that suggest relevant items such as books, DVDs, clothes, news, or articles are therefore needed. To help users get relevant data and make good decisions, many studies have been carried out and many recommendation systems have been developed.

Journal of Applied Information Science 7 (2), December 2019, 31-37 http://www.publishingindia.com/jais

Collaborative filtering recommendation systems face challenges such as the cold start problem, data sparsity, scalability, synonymy, gray sheep, and shilling attacks [11]. The cold start problem occurs when it is impossible to make a recommenda-