Users and Noise: The Magic Barrier of Recommender Systems

Alan Said †, Brijnesh J. Jain †, Sascha Narr, Till Plumbaum
Technische Universität Berlin, DAI Lab
{alan, jain, narr, till}@dai-lab.de
† Both authors contributed equally to this work

Abstract. Recommender systems are crucial components of most commercial web sites to keep users satisfied and to increase revenue. Thus, a lot of effort is made to improve recommendation accuracy. But when is the best possible performance of the recommender reached? The magic barrier refers to some unknown level of prediction accuracy that a recommender system can attain. The magic barrier reveals whether there is still room for improving prediction accuracy, or indicates that any further improvement is meaningless. In this work, we present a mathematical characterization of the magic barrier based on the assumption that user ratings are afflicted with inconsistencies (noise). In a case study with a commercial movie recommender, we investigate the inconsistencies of the user ratings and estimate the magic barrier in order to assess the actual quality of the recommender system.

Keywords: Recommender Systems, Noise, Evaluation Measures, User Inconsistencies

1 Introduction

Recommender systems play an important role in most top-ranked commercial websites such as Amazon, Netflix, Last.fm or IMDb [10]. The goal of these recommender systems is to increase revenue and present personalized user experiences by providing suggestions for previously unknown items that are potentially interesting for a user. With the growing amount of data on the Internet, recommender systems become even more important for guiding users through the mass of data.

The key role of recommender systems has resulted in a vast amount of research in this field, which has yielded a plethora of different recommender algorithms [1, 4, 8]. An example of a popular and widely used approach to recommenders is collaborative filtering.
Collaborative filtering computes user-specific recommendations based on historical user data, such as ratings or usage patterns [4, 7]. Other approaches include content-based recommenders (recommend items based on properties of a specific item), social recommenders (recommend things based on the past behavior of similar users in the social network) or hybrid combinations of several different approaches.

To select an appropriate recommender algorithm, and adapt it to a given scenario or problem, the algorithms are usually examined by testing their performance using either artificial or real test data reflecting the problem. The best performing algorithm and