DISCOVERING NEXT GENERATION PRODUCT INNOVATIONS BY IDENTIFYING LEAD USER PREFERENCES EXPRESSED THROUGH LARGE SCALE SOCIAL MEDIA DATA

Suppawong Tuarob
Computer Science and Engineering
The Pennsylvania State University
University Park, PA 16802
Email: suppawong@psu.edu

Conrad S. Tucker
Engineering Design and Industrial and Manufacturing Engineering
Computer Science and Engineering
The Pennsylvania State University
University Park, PA 16802
Email: ctucker4@psu.edu

ABSTRACT

An innovative consumer (a.k.a. a lead user) is a consumer of a product who faces needs unknown to the public. Innovative consumers play important roles in the product development process, as their ideas tend to be unique and can be useful for developing next generation, innovative products that better satisfy market needs. Consumers often describe their usage experience and opinions about products and product features through social networks such as Twitter and Facebook, making social media a viable, information-rich, large-scale source for mining product-related information. The authors of this work propose a data mining methodology to automatically identify innovative consumers from a heterogeneous pool of social media users. Specifically, a mathematical model is proposed to identify latent features (product features unknown to the public) from social media data. These latent features then serve as the key to discovering innovative users in the ever-increasing pool of social media users. A real-world case study, which identifies smartphone lead users in the pool of Twitter users, illustrates the promise of the proposed models.

1 Introduction

It has long been believed that consumers exist at the end of product chains, merely to buy and consume what producers create, while companies are the sole entities involved in product development [40].
However, multiple research studies such as [5, 23, 38, 39] have shown that this innovation paradigm no longer holds: consumers themselves are actually the source of the innovation reflected in today's products in the market space [12, 19, 24, 28]. Recently, an increasing number of companies have altered their product innovation paradigms by making consumers the center of product development, rather than seeing consumers as the market [4]. These innovative users have been shown to be able to identify needs beyond what the products in the market space can satisfy. Such needs are often converted into product development ideas that could be incorporated in future products. For example, 3M assembled a team of lead users that included a veterinary surgeon, a makeup artist, doctors from developing countries, and military medics (footnote 1). The recruited lead users then brainstormed their ideas in a two-and-a-half-day workshop. As a result, 3M initiated three product lines (i.e., the Economy, Skin Doctor, and Armor lines), which were shown to be eight times more profitable than those from the traditional product development method [17]. However, a drawback of such consumer-innovator paradigms is that only a fraction of consumers have the potential to generate innovative ideas useful for developing the target products. This makes the selection of such innovative consumers (a.k.a. lead users) a challenging early task that requires large amounts of both time and financial resources.

Society generates more than 2.5 quintillion (10^18) bytes of data each day [42]. A substantial amount of this data is generated through social media services such as Twitter, Facebook, and Google, which process anywhere between 12 terabytes (10^12) and 20 petabytes (10^15) of data each day [1]. Social media allows its users to exchange information in a dynamic, seamless manner almost anywhere and anytime.
Footnote 1: http://www.leaduser.com/

Proceedings of the ASME 2014 International Design Engineering Technical Conferences & Computers and Information in Engineering Conference, IDETC/CIE 2014, August 17-20, 2014, Buffalo, New York, USA. DETC2014-34767. Copyright © 2014 by ASME.

Knowledge extracted