Monetizing User Activity on Social Networks - Challenges and Experiences

Meenakshi Nagarajan ∗ , Kamal Baid † , Amit Sheth ∗ and Shaojun Wang ∗
∗ kno.e.sis center, Wright State University, Dayton, OH, USA; meena, amit, shaojun @ knoesis.org
† Dept. of Computer Science and Eng., Indian Institute of Technology, Guwahati, India; b.kamal@iitg.ernet.in

Abstract—This work summarizes challenges and experiences in monetizing user activity on public forums on social network sites. We present an approach that identifies the monetization potential of user posts and eliminates off-topic content to identify the most relevant and monetizable keywords for advertising. Preliminary studies using data from MySpace and Facebook show that 52% of ad impressions generated using keywords from our system were more targeted, compared to the 30% relevant impressions generated without using our system.

Keywords: user activity, off-topic noise, contextual ads

I. INTRODUCTION

Current advertising approaches to monetizing user content on social networks are profile-based contextual advertisements, demographic-based ads or a combination of the two. Demographic-based ads target an individual by age, gender and location information. Profile-based ads exploit information such as interests and activities on user profiles for delivering ads. Profile-based ads are a type of content-based ad, generated by automatically finding relevant keywords on a page and displaying ads based on those keywords. Content-based ad delivery was made popular on the Web, where ads matched the content that a user was viewing on a web page. Not surprisingly, this model was a good contender for social networking sites (SNSs), where ads need to be highly targeted to the content in order to trump the value of networking. However, the utility of ad models proposed to date on SNSs is not yet apparent to their members.
Besides issues of trust, privacy and scattered user attention on SNSs, the content that is being exploited for ad generation is also an important point of concern. While profile information might be useful for launching product campaigns and micro-targeting customers, it does not necessarily contain current interests or purchase intents. Ads generated from such content are inherently less relevant to a user. Over time, this leads to a scenario where ad campaigns see several ad impressions but very few clickthroughs.

With the growing popularity of online social networks, members are extensively using public venues like forums, marketplaces and groups to seek opinions from peers, write about things they bought, offer advice and so on. Intents expressed on these venues are often representative of a user’s current needs and, in several cases, monetizable. Content from public forums is also less likely to be a target of privacy concerns, given that posted content is not personal information. In this work, we posit that in addition to using profile information, ad programs should generate profile ads (ads shown on a user profile) from user activity on public venues on SNSs. While the intuition is rather straightforward, there are challenges that need to be addressed before such content can be used for monetization.

1. User Intentions: Users write on SNSs with different intentions. For example, Post 1 below shows a clear transactional intent, while Post 2 shares an opinion.

Post 1: i am looking for a 32 GB iTouch. cheaper then what apple sells it
Post 2: MSoffices convinient to use once you have the softwares put in

The problem of identifying user intents in such free text is not the same as that of identifying intentions behind web queries. We observed that, unlike in web search, the use of certain entity types does not accurately classify a post’s intents.
For example, the presence of a product name does not immediately imply navigational or transactional intents. The same product X can appear with several user intentions on SNSs: ‘i am thinking of getting X’ (transactional); ‘i like my new X’ (information sharing); and ‘what do you think about X’ (information seeking). Among the many footprints users leave on a site, it is important to identify those with high monetization potential, so as to generate profile ads that users are more likely to click.

2. Informal content and off-topic noise: A characteristic of communities, both online and offline, is the shared understanding and context they operate on. Slang and variations of entity names, such as ‘puters’ for computers and ‘Skik3’ for the product Sidekick3, are commonplace. Not being able to spot such keywords in posts will mean fewer matched ad impressions. Additionally, due to the interactional nature of social networking platforms, when users share information, they are typically sharing an experience or event. The main message is overloaded with information that is off-topic for the task of advertising. Consider this post from the Computers forum on MySpace:

I NEED HELP WITH SONY VEGAS PRO 8!! Ugh and i have a video project due tomorrow for merrill lynch ..and i got food poisoning from eggs. its not fun. help?

Not eliminating noisy keywords like ‘Merrill Lynch’ and ‘food poisoning’ will potentially result in ads unrelated to the post.

Contributions: Here we present our experiences with building a system that (a) identifies monetizable user activity or posts and (b) eliminates off-topic noise in these user posts, so only the most relevant keywords are used for generating