Learning User Characteristics from Social Tagging Behavior

Karin Schöfegger
Knowledge Management Institute, Graz University of Technology, Graz, Austria
k.schoefegger@gmail.com

Christian Körner
Knowledge Management Institute, Graz University of Technology, Graz, Austria
christian.koerner@tugraz.at

Philipp Singer
Knowledge Management Institute, Graz University of Technology, Graz, Austria
philipp.singer@tugraz.at

Michael Granitzer
Chair of Media Informatics, University of Passau, Passau, Germany
Michael.Granitzer@uni-passau.de

ABSTRACT
In social tagging systems the tagging activities of users leave a huge amount of implicit information about them. The users choose tags for the resources they annotate based on their interests, background knowledge, personal opinion and other criteria. Whilst existing research in mining social tagging data mostly focused on e.g., gaining a deeper understanding of the user's interests and the emerging structures in those systems, little work has yet been done to use the rich implicit information in tagging activities to unveil to what degree users' tags convey information about their background. The automatic inference of user background information can be used to complete user profiles which in turn supports various recommendation mechanisms. This work illustrates the application of supervised learning mechanisms to analyze a large online corpus of tagged academic literature for extraction of user characteristics from tagging behavior. As a representative example of background characteristics we mine the user's research discipline. Our results show that tags convey rich information that can help designers of those systems to better understand and support their prolific users - users that tag actively - beyond their interests.

Categories and Subject Descriptors
H.1.2 [User/Machine Systems]: Human Factors; H.1.2 [Information Systems]: Models and Principles—Human information processing; H.4 [Information Systems Applications]: Miscellaneous

General Terms
Algorithms, Human Factors

Keywords
tagging, user background, social software

1. INTRODUCTION
Tagging provides an easy and intuitive way for users to annotate, organize and re-find resources. For this reason a huge number of systems have added tagging functionality. To give some examples: Delicious is a social bookmarking platform that enables users to apply tags in order to organize their bookmarked websites, YouTube allows content creators to assign tags so that their videos can be found more easily later on, and Mendeley is a platform where users can annotate their papers with tags. While in recent years a lot of research has investigated social tagging systems and the resulting folksonomic structure, or mined the users' interests to support personalization in those systems, there still exists little information about how users' background (e.g., research discipline, gender, location, ...) manifests itself in the tags they use. This is mainly due to a lack of profile information in social tagging datasets (see the call for social tagging datasets by Körner et al. [4] for details). Consequently, little is known about how users' background information is reflected in the tags used.

In this work we explore to what degree users' tags convey information about their characteristics in the setting of social tagging systems for academic publications. According to Brusilovsky et al. [2] and Webb et al. [11], one of the most popular user characteristics modeled in adaptive hypermedia as well as web personalization, besides user interest and knowledge, is personal information (or user background).
This user background is usually defined as rather static information such as demographics (e.g., name, age, gender, location) or information about the user's profession, area of work, job responsibilities etc. This background information is usually provided explicitly by the user.

In the academic setting, a user's research discipline is an important piece of background information for improving, e.g., recommendations and information retrieval, which is why we chose