Linked Open Data-enabled Strategies for Top-N Recommendations

Cataldo Musto, Dept. of Computer Science, Univ. of Bari Aldo Moro, Italy, cataldo.musto@uniba.it
Pierpaolo Basile, Dept. of Computer Science, Univ. of Bari Aldo Moro, Italy, pierpaolo.basile@uniba.it
Pasquale Lops, Dept. of Computer Science, Univ. of Bari Aldo Moro, Italy, pasquale.lops@uniba.it
Marco de Gemmis, Dept. of Computer Science, Univ. of Bari Aldo Moro, Italy, marco.degemmis@uniba.it
Giovanni Semeraro, Dept. of Computer Science, Univ. of Bari Aldo Moro, Italy, giovanni.semeraro@uniba.it

ABSTRACT

The huge amount of interlinked information referring to different domains, provided by the Linked Open Data (LOD) initiative, could be effectively exploited by recommender systems to deal with the cold-start and sparsity problems. In this paper we investigate the contribution of several features extracted from the Linked Open Data cloud to the accuracy of different recommendation algorithms. We focus on the top-N recommendation task in the presence of binary user feedback and cold-start situations, that is, predicting ratings for users who have only a few past ratings, and predicting ratings of items that have been rated by only a few users. Results show the potential of Linked Open Data-enabled approaches to outperform existing state-of-the-art algorithms.

Categories and Subject Descriptors

H.3.3 [Information Systems]: Information Search and Retrieval

Keywords

Content-based Recommender Systems; Top-N recommendations; Implicit Feedback; Linked Open Data; DBpedia

1. INTRODUCTION

Recently, novel and more accessible forms of information coming from different open knowledge sources have come to represent a rapidly growing piece of the big data puzzle.
Over the last few years, more and more semantic data have been published following the Linked Data principles (http://www.w3.org/DesignIssues/LinkedData.html), connecting information referring to geographical locations, people, companies, books, scientific publications, films, music, TV and radio programs, genes, proteins, drugs, online communities, statistical data, and reviews in a single global data space, the Web of Data [2]. These interlinked data form a global graph called the Linked Open Data cloud, whose nucleus is represented by DBpedia (http://dbpedia.org). A fragment of the Linked Open Data cloud is depicted in Figure 1.

Figure 1: Fragment of the Linked Open Data cloud (as of September 2011).

Using open or pooled data from many sources, often combined and linked with proprietary big data, can help develop insights that are difficult to uncover with internal data alone [4], and such data can be effectively exploited by recommender systems to deal with the classical problems of cold start and sparsity.

On the other hand, the use of a huge amount of interlinked data poses new challenges to recommender systems researchers, who have to find effective ways to integrate such knowledge into recommendation paradigms.

This paper presents a preliminary investigation in which we propose and evaluate different ways of including several kinds of Linked Open Data features in different classes of recommendation algorithms. The evaluation is focused on the top-N recommendation task in the presence of binary user feedback and cold-start situations.

CBRecSys 2014, October 6, 2014, Silicon Valley, CA, USA. Copyright 2014 for the individual papers by the paper’s authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.