Analysis of Recommender Systems’ Algorithms

Emmanouil Vozalis, Konstantinos G. Margaritis

Abstract: In this work, we provide a brief review of different recommender systems’ algorithms which have been proposed in the recent literature. First, we present the basic challenges and problems of recommender systems. Then, we give an overview of association rules, memory-based, model-based and hybrid recommendation algorithms. Finally, we discuss evaluation metrics for measuring the performance of those systems.

Keywords: Collaborative Filtering, Recommender Systems, Machine Learning

I. Introduction

Recommender Systems were introduced as a computer-based intelligent technique to deal with the problem of information and product overload. They can be utilized to efficiently provide personalized services in most e-business domains, benefiting both the customer and the merchant. Recommender Systems benefit the customer by suggesting items that he is likely to like. At the same time, the business benefits from the increase in sales which normally occurs when the customer is presented with more items he would likely find appealing.

The two basic entities which appear in any Recommender System are the user (sometimes also referred to as customer) and the item (also referred to as product in the bibliography). A user is a person who utilizes the Recommender System, providing his opinion about various items, and receives recommendations about new items from the system.

The input to a Recommender System depends on the type of the employed filtering algorithm. Various filtering algorithms will be discussed in subsequent sections. Generally, the input belongs to one of the following categories:

1. Ratings (also called votes), which express the opinion of users on items. Ratings are normally provided by the user and follow a specified numerical scale (example: 1-bad to 5-excellent).
A common rating scheme is the binary rating scheme, which allows only ratings of either 0 or 1. Ratings can also be gathered implicitly from the user’s purchase history, web logs, hyperlink visits, browsing habits or other types of information access patterns.
2. Demographic data, which refer to information such as the age, the gender and the education of the users. This kind of data is usually difficult to obtain. It is normally collected explicitly from the user.
3. Content data, which are based on a textual analysis of documents related to the items rated by the user. The features extracted by this analysis are used as input to the filtering algorithm in order to infer a user profile.

The authors are with the Parallel and Distributed Processing Laboratory, Department of Applied Informatics, University of Macedonia, Thessaloniki, Greece. E-mail: {mans, kmarg}@uom.gr. Web: http://macedonia.uom.gr/ {mans, kmarg}.

The goal of Recommender Systems is to generate suggestions about new items or to predict the utility of a specific item for a particular user. In both cases the process is based on the input provided, which is related to the preferences of that user.

Let $m$ be the number of users, $U = \{u_1, u_2, \ldots, u_m\}$, and $n$ the number of items, $I = \{i_1, i_2, \ldots, i_n\}$. Each user $u_i$, where $i = 1, 2, \ldots, m$, has a list of items $I_{u_i}$ about which he has expressed his opinion. It is important to note that $I_{u_i} \subseteq I$, while it is also possible for $I_{u_i}$ to be the empty set, meaning that users are not required to reveal their preferences for all existing items. The count of items in $I_{u_i}$ is $n_i = |I_{u_i}|$, with $n_i \leq n$. User opinions are generally stated in the form of a rating score. Specifically, the rating of user $u_i$ for item $i_j$, where $j = 1, 2, \ldots, n$, is denoted by $r_{i,j}$, where each rating is either a real number within the agreed numerical scale or a special symbol standing for "no rating".
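The notation above can be made concrete with a small sketch. The toy users, items and ratings below are hypothetical, as are the helper names `rated_items` and `rating`; Python's `None` is used here only as a stand-in for the "no rating" symbol, and a sparse dictionary is one possible storage choice among many.

```python
from typing import Optional

# Hypothetical toy data: m = 3 users, n = 4 items, ratings on a 1-5 scale.
users = ["u1", "u2", "u3"]
items = ["i1", "i2", "i3", "i4"]

# Sparse storage: only stated opinions are kept; an absent key means "no rating".
ratings: dict[str, dict[str, float]] = {
    "u1": {"i1": 5.0, "i3": 2.0},
    "u2": {"i2": 4.0},
    "u3": {},  # this user has rated nothing, so I_{u_3} is the empty set
}


def rated_items(user: str) -> set[str]:
    """Return I_{u_i}, the set of items this user has rated."""
    return set(ratings.get(user, {}))


def rating(user: str, item: str) -> Optional[float]:
    """Return r_{i,j}, or None as a stand-in for "no rating"."""
    return ratings.get(user, {}).get(item)


for u in users:
    I_u = rated_items(u)
    assert I_u <= set(items)       # I_{u_i} is a subset of I
    assert len(I_u) <= len(items)  # n_i <= n
    print(u, sorted(I_u), len(I_u))
```

Keeping only the stated opinions mirrors the fact that users are not required to rate every item, so most user-item pairs carry no rating at all.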
All these available ratings are collected in an $m \times n$ user-item matrix, denoted by $R$. The proposed filtering algorithms operate either on the rows of this user-item matrix, which correspond to the ratings of a single user for different items, or on its columns, which correspond to different users’ ratings for a single item. We distinguish a single user $u_a \in U$ as the active user and define $NR \subseteq I$ as the subset of items for which the active user has not yet stated his opinion and for which, as a result, the Recommender System should generate suggestions.

The output of a Recommender System can be either a Prediction or a Recommendation.

A Prediction is expressed as a numerical value, $r_{a,j}$, which represents the anticipated opinion of active user $u_a$ for item $i_j$. This predicted value must lie within the same numerical scale (example: 1-bad to 5-excellent) as the opinions initially provided by active user $u_a$. This form of Recommender Systems output is also known as Individual Scoring.

A Recommendation is expressed as a list of $N$ items, where $N \leq n$, which the active user is expected to like the most. The usual approach in that case requires this list to include only items that the active user has not already purchased, viewed or rated. This form of Recommender Systems output is also known as Top-N Recommendation or Ranked Scoring.

II. Challenges and Problems

In this section we discuss the fundamental problems that Recommender Systems suffer from. It is important for each newly proposed filtering algorithm to suggest solutions for those problems.

Quality of Recommendations: Trust is the key word here. Customers need recommendations which they can trust. To achieve that, a recommender system should minimize