Cold-start News Recommendation with Domain-dependent Browse Graph

Michele Trevisiol 1,2, Luca Maria Aiello 1, Rossano Schifanella 3,*, Alejandro Jaimes 1

1 Yahoo Labs, Barcelona, Spain {laiello,ajaimes}@yahoo-inc.com
2 Universitat Pompeu Fabra, Barcelona, Spain {trevisiol}@acm.org
3 Università degli Studi di Torino, Torino, Italy {schifane}@di.unito.it

ABSTRACT
Online social networks and mash-up services create opportunities to connect different web services otherwise isolated. Specifically in the case of news, users are very much exposed to news articles while performing other activities, such as social networking or web searching. Browsing behavior aimed at the consumption of news, especially in relation to the visits coming from other domains, has been mainly overlooked in previous work. To address that, we build a BrowseGraph out of the collective browsing traces extracted from a large viewlog of Yahoo News (0.5B entries), and we define the ReferrerGraph as its subgraph induced by the sessions with the same referrer domain. The structural and temporal properties of the graph show that browsing behavior in news is highly dependent on the referrer URL of the session, in terms of type of content consumed and time of consumption. We build on this observation and propose a news recommender that addresses the cold-start problem: given a user landing on a page of the site for the first time, we aim to predict the page she will visit next. We compare 24 flavors of recommenders belonging to the families of content-based, popularity-based, and browsing-based models. We show that the browsing-based recommender that takes into account the referrer URL is the best performing, achieving a prediction accuracy of 48% in conditions of heavy data sparsity.
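
To make the definitions in the abstract concrete, the following minimal Python sketch builds a toy BrowseGraph from browsing sessions, extracts the ReferrerGraph for one referrer domain, and predicts the next page as the most frequent successor. It is an illustration under stated assumptions, not the method evaluated in this paper: the session format, the function names (build_browse_graph, build_referrer_graph, predict_next), and the toy data are all hypothetical, and edge weights are assumed to be plain transition counts.

```python
from collections import Counter, defaultdict

def build_browse_graph(sessions):
    """Count page-to-page transitions over all sessions.

    Each session is assumed to be a pair (referrer_domain, pages),
    where pages lists page identifiers in visit order.
    """
    graph = defaultdict(Counter)
    for _referrer, pages in sessions:
        # Every pair of consecutive pages adds one unit of weight to an edge.
        for src, dst in zip(pages, pages[1:]):
            graph[src][dst] += 1
    return graph

def build_referrer_graph(sessions, referrer):
    """Build the graph from the sessions of a single referrer domain only."""
    return build_browse_graph([s for s in sessions if s[0] == referrer])

def predict_next(graph, page):
    """Return the most frequent successor of page, or None if it has none."""
    successors = graph.get(page)
    if not successors:
        return None
    return successors.most_common(1)[0][0]

# Hypothetical toy sessions: (referrer domain, pages visited in order).
sessions = [
    ("search", ["a", "b", "c"]),
    ("search", ["a", "b"]),
    ("social", ["a", "d"]),
    ("social", ["a", "d"]),
    ("social", ["a", "d"]),
]

full_graph = build_browse_graph(sessions)
search_graph = build_referrer_graph(sessions, "search")

print(predict_next(full_graph, "a"))
print(predict_next(search_graph, "a"))
```

In this toy data the two graphs disagree on the successor of page "a", which illustrates why conditioning on the referrer domain can change the recommendation for a user about whom nothing else is known.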
Categories and Subject Descriptors
H.4 [Information Systems Applications]: Miscellaneous

Keywords
BrowseGraph, cold-start, news recommendation, browsing behavior, browsing sessions, recommender systems

* This work was performed while the author was a Visiting Scientist at Yahoo Labs, within the framework of the FREP grant.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
RecSys’14, October 6–10, 2014, Foster City, Silicon Valley, CA, USA.
Copyright 2014 ACM 978-1-4503-2668-1/14/10 $15.00.
http://dx.doi.org/10.1145/2645710.2645726.

Figure 1: An article page from Yahoo News (compacted layout). Right rail boxes and the infinite-scroll section at the bottom allow the user to browse to other articles.

1. INTRODUCTION
In recent years the consumption of online news has increased rapidly, in contrast with the decline of traditional newspapers 1. Between 2009 and 2012 the percentage of users visiting news portals rose steadily, to the point of representing the major portion of overall Web traffic 2, comparable to the volume of visits to top domains like Google search 3. Because of its importance, richness of content, and abundant user participation, the field of online news has become a crowded arena for research in several areas including retrieval, ranking, recommendation, and personalization [6, 2, 30]. Despite the vast amount of work in the field, there are two aspects of news consumption that are still largely unexplored.
1 http://stateofthemedia.org/2012/overview-4/key-findings/
2 http://www.people-press.org/2012/09/27/section-2-online-and-digital-news-2/
3 http://www.theguardian.com/news/datablog/2012/jun/22/website-visitor-statistics-nielsen-may-2012-google

First, modern online news providers have turned into more globally connected systems able to attract a wider audience than their core of regular users. News articles are very often shared on different external websites and social media platforms, thus providing a growing number of browsing shortcuts to news portals. To mention two examples, modern search engines serve queries relevant to news stories by directly featuring news articles from major providers, and social media are increasingly used as daily tools for journal-