Analysis of Recommender Systems’ Algorithms

Emmanouil Vozalis, Konstantinos G. Margaritis

Abstract: In this work, we provide a brief review of different recommender systems’ algorithms that have been proposed in the recent literature. First, we present the basic challenges and problems of recommender systems. Then, we give an overview of association rules, memory-based, model-based and hybrid recommendation algorithms. Finally, we discuss evaluation metrics for measuring the performance of those systems.

Keywords: Collaborative Filtering, Recommender Systems, Machine Learning

I. Introduction

Recommender Systems were introduced as a computer-based intelligent technique to deal with the problem of information and product overload. They can be used to provide personalized services efficiently in most e-business domains, benefiting both the customer and the merchant. Recommender Systems benefit the customer by suggesting items that he is likely to enjoy. At the same time, the business benefits from the increase in sales that normally occurs when the customer is presented with more items he is likely to find appealing.

The two basic entities which appear in any Recommender System are the user (sometimes also referred to as customer) and the item (also referred to as product in the bibliography). A user is a person who uses the Recommender System, provides his opinion about various items, and receives recommendations about new items from the system.

The input to a Recommender System depends on the type of filtering algorithm employed. Various filtering algorithms will be discussed in subsequent sections. Generally, the input belongs to one of the following categories:

1. Ratings (also called votes), which express the opinion of users on items. Ratings are normally provided by the user and follow a specified numerical scale (example: 1-bad to 5-excellent).
A common rating scheme is the binary rating scheme, which allows only ratings of either 0 or 1. Ratings can also be gathered implicitly from the user’s purchase history, web logs, hyperlink visits, browsing habits or other types of information access patterns.

2. Demographic data, which refer to information such as the age, the gender and the education of the users. This kind of data is usually difficult to obtain. It is normally collected explicitly from the user.

3. Content data, which are based on a textual analysis of documents related to the items rated by the user. The features extracted by this analysis are used as input to the filtering algorithm in order to infer a user profile.

The authors are with the Parallel and Distributed Processing Laboratory, Department of Applied Informatics, University of Macedonia, Thessaloniki, Greece. E-mail: {mans, kmarg}@uom.gr. Web: http://macedonia.uom.gr/ {mans, kmarg}.

The goal of Recommender Systems is to generate suggestions about new items or to predict the utility of a specific item for a particular user. In both cases the process is based on the input provided, which is related to the preferences of that user. Let $m$ be the number of users, $U = \{u_1, u_2, \ldots, u_m\}$, and $n$ the number of items, $I = \{i_1, i_2, \ldots, i_n\}$. Each user $u_i$, where $i = 1, 2, \ldots, m$, has a list of items $I_{u_i}$ about which he has expressed his opinion. It is important to note that $I_{u_i} \subseteq I$, while it is also possible for $I_{u_i}$ to be the empty set, meaning that users are not required to reveal their preferences for all existing items. The number of items in $I_{u_i}$ is $n_i = |I_{u_i}|$, with $n_i \le n$. User opinions are generally stated in the form of a rating score. Specifically, the rating of user $u_i$ for item $i_j$, where $j = 1, 2, \ldots, n$, is denoted by $r_{i,j}$, where each rating is either a real number within the agreed numerical scale or $\perp$, the symbol for "no rating".
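The notation above can be illustrated with a minimal Python sketch. The user names, item names and rating values are made-up toy data, and the helper names `rated_items` and `rated_count` are hypothetical; `None` is used here as a stand-in for the "no rating" symbol.

```python
# Hypothetical toy data: m = 3 users and n = 4 items, ratings on a 1-5 scale.
users = ["u1", "u2", "u3"]
items = ["i1", "i2", "i3", "i4"]

# ratings[i][j] plays the role of the rating of user i for item j.
# None stands in for the "no rating" symbol (an assumption of this sketch).
ratings = [
    [5.0, None, 3.0, None],    # u1 has rated i1 and i3
    [None, None, None, None],  # u2 has rated nothing, so the rated set is empty
    [1.0, 4.0, None, 2.0],     # u3 has rated i1, i2 and i4
]


def rated_items(i):
    """Return the set of items for which user number i has stated a rating."""
    return {items[j] for j, r in enumerate(ratings[i]) if r is not None}


def rated_count(i):
    """Return the number of items rated by user number i."""
    return len(rated_items(i))


for i, user in enumerate(users):
    print(user, sorted(rated_items(i)), rated_count(i))
```

In this toy data the second user has an empty set of rated items, which the notation explicitly allows, and no user's count of rated items exceeds the total number of items.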
All these available ratings are collected in an $m \times n$ user-item matrix, denoted by $R$. The proposed filtering algorithms apply various techniques either to the rows of this matrix, which correspond to the ratings of a single user for different items, or to its columns, which correspond to different users’ ratings for a single item. We distinguish a single user $u_a \in U$ as the active user and define $NR \subseteq I$ as the subset of items for which the active user has not yet stated his opinion and for which, as a result, the Recommender System should generate suggestions.

The output of a Recommender System can be either a Prediction or a Recommendation.

• A Prediction is expressed as a numerical value, $r_{a,j}$, which represents the anticipated opinion of active user $u_a$ for item $i_j$. This predicted value must lie within the same numerical scale (example: 1-bad to 5-excellent) as the ratings initially provided by active user $u_a$. This form of Recommender Systems output is also known as Individual Scoring.

• A Recommendation is expressed as a list of $N$ items, where $N \le n$, which the active user is expected to like the most. The usual approach requires this list to include only items that the active user has not already purchased, viewed or rated. This form of Recommender Systems output is also known as Top-N Recommendation or Ranked Scoring.

II. Challenges and Problems

In this section we discuss the fundamental problems that Recommender Systems suffer from. It is important for each newly proposed filtering algorithm to suggest solutions to those problems.

Quality of Recommendations: Trust is the key word here. Customers need recommendations that they can trust. To achieve that, a recommender system should minimize