Popular Topical Authors in Brazilian Blogosphere using comments as Relationships

Henrique D.P. Santos, Leandro Krug Wives
PPGC/UFRGS, Brazil
{hdpsantos, wives}@inf.ufrgs.br

Abstract. Mining the Web is a current trend for discovering latent information. Blogs are great sources of new insights, since they are an easy platform for publishing texts online. In this context, it is important to find which authors are most popular within the community on a given topic. This paper presents a study that considers Brazilian bloggers on the Blogspot platform to discover popular topical authors. We collected information from 30 million posts from 2011 and considered users’ comments as the source for establishing relationships among bloggers. In our experiments, we first identify the community around the topic of interest and then, using our proposed Topic PageRank algorithm, build a list of the most popular authors on that topic. Additionally, our method results in a better ranking than the original PageRank. We also characterize the database with a survey of over 4 thousand authors.

Categories and Subject Descriptors: H.2.8 [Database Management]: Database Applications, Data Mining; H.3.3 [Information Storage and Retrieval]: Retrieval models, Selection process

Keywords: Data Mining, Brazilian Blogosphere, Social Network Analysis, Topic Author Popularity, Link Analysis

1. INTRODUCTION

In recent years, blogging has become a trend among people who publish content on the Web. With any simple blog-creation platform, users can rapidly share their daily diaries, discuss the latest news, and express their opinions. Because blogs are a growing online medium, many blog-hosting sites now provide free services. Given this convenient platform, the number of blogs is quickly increasing, and the set of all existing blogs is known as the Blogosphere [Agarwal and Liu 2008].

A typical blog consists of a title, related subscription information, and multiple posts, which are displayed in descending order of publication date. A general blog post consists of text combined with a post date, hyperlinks, images, and other media. The user who owns a blog and writes on it is known as a blogger. Other bloggers can comment on the posts contained in a blog. Bloggers can also track other blogs, which means that they are interested in the topics of those blogs, and they can add favorite blogs to their blogrolls, which are components listed on the front page of a blog that indicate the subscriptions or links the blogger likes the most. These are the most common interactions among bloggers. In addition, hyperlinks contained within a blog post give additional information to readers who would like to read related news or blog posts. If someone wants to mine blogs, these are the kinds of data that can be explored (i.e., content and relationships).

According to Adams et al., blogs are decentralized and dynamic, often responding to real-world events ahead of the mainstream media. Crucially, they are built upon implicit trust networks, which create an atmosphere of accountability, turning the Blogosphere into a good thermometer and propagator of users’ opinions. This has created huge commercial interest from big media players and search companies (such as Google, Yahoo!, News Corp, etc.), whose initial applications are related to product feedback and viral marketing [Adams et al. 2010].

Vol. 3, No. 3, October 2012, Pages 1–0??.
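To make the two kinds of blog data mentioned above (content and relationships) more concrete, the following minimal Python sketch models posts and their comments and derives comment-based relationships among bloggers. It is only an illustration under stated assumptions: the names `Post`, `Comment`, and `comment_edges`, the sample bloggers, and the choice to point each edge from the commenter to the post's author are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    author: str  # blogger who wrote the comment (hypothetical field name)


@dataclass
class Post:
    author: str  # blogger who owns the blog and wrote the post
    text: str    # textual content of the post
    comments: List[Comment] = field(default_factory=list)


def comment_edges(posts):
    """Count directed (commenter, post author) pairs, ignoring self-comments.

    The edge direction is an assumption made for this sketch.
    """
    edges = Counter()
    for post in posts:
        for comment in post.comments:
            if comment.author != post.author:
                edges[(comment.author, post.author)] += 1
    return edges


# Made-up sample data, for illustration only.
posts = [
    Post("ana", "Recipe of the week", [Comment("bruno"), Comment("carla")]),
    Post("bruno", "Reply to a recipe", [Comment("ana"), Comment("bruno")]),
    Post("ana", "Another recipe", [Comment("bruno")]),
]
edges = comment_edges(posts)
```

Each key in `edges` is a pair of bloggers and each value is the number of comments linking them, which is one possible starting point for building a weighted graph of the community.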