User-generated Content Curation with Deep Convolutional Neural Networks

Ruben Tous*, Otto Wust†, Mauro Gomez†, Jonatan Poveda†, Marc Elena†, Jordi Torres*‡, Mouna Makni* and Eduard Ayguadé*‡
*Universitat Politècnica de Catalunya (UPC). Barcelona, Spain
†Adsmurai. Barcelona, Spain
‡Barcelona Supercomputing Center (BSC). Barcelona, Spain
Email: {rtous, mmakni, lcruz}@ac.upc.edu, {mauro, jonatan, otto}@adsmurai.com, {jordi.torres, eduard.ayguade}@bsc.es

Abstract—In this paper, we report on work that uses deep convolutional neural networks (CNNs) to curate and filter photos posted by social media users (Instagram and Twitter). The final goal is to facilitate searching and discovering user-generated content (UGC) with potential value for digital marketing tasks. The images are captured in real time and automatically annotated with multiple CNNs. Some of the CNNs perform generic object recognition tasks while others perform what we call visual brand identity recognition. We report experiments with 5 real brands in which more than 1 million real images were analyzed. In order to speed up the training of custom CNNs we applied a transfer learning strategy.

I. INTRODUCTION

Instagram users share more than 80 million photos per day, captured from all corners of the earth. Twitter users post more than 500 million tweets each day, of which 7% contain images. A significant part of this visual user-generated content has potential value for digital marketing tasks. On the one hand, users' photos can be analyzed to obtain knowledge about users' behavior and opinions in general, or with respect to certain products or brands. On the other hand, some users' photos can be of value in themselves, as original and authentic content that can be used, with the users' permission, in a brand's communication channels.
This work addresses this second use case: searching, discovering and exploiting user-generated content (UGC) for digital marketing tasks, which has traditionally been handled by so-called content curation technologies. Discovering valuable images in social media streams is challenging. The bandwidth to analyze is huge and, while they help, user-defined tags are scarce and noisy. Most current solutions rely on costly manual curation over random samples. As a result, much of the content is never processed at all, and many valuable photos go unnoticed. We propose using deep convolutional neural networks (CNNs) to minimize manual curation as much as possible and to make it more efficient. As a result, we increase the number of photos processed by several orders of magnitude, we increase the quality of the resulting photos (as more photos are analyzed and only the best ones go through manual curation), we enable near real-time discovery and, last but not least, we drastically reduce the cost.

Figure 1. Example images posted by Instagram users and tagged with Desigual's promotional hashtags (e.g. #lavidaeschula)

The way we do this is by automatically tagging the incoming images with multiple CNNs. Some of the CNNs perform generic object recognition tasks and annotate the images with tags that describe their semantics (e.g. "beach", "car", etc.). Other CNNs perform what we call visual brand identity (VBI) recognition. Given a brand, we train a model with images that the brand has used in its previous marketing campaigns and that are representative of the brand's visual identity. Given a campaign for a certain brand, we use the corresponding VBI CNN to automatically pre-select images that fit the visual identity of that brand. As a final step, a human expert performs a final selection with the help of a search interface
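To make the pre-selection step concrete, the following is a minimal sketch of how the annotations from the two kinds of CNNs could be combined to shortlist images for the human curator. It assumes each image already carries generic (tag, confidence) pairs and a VBI fit score; the class and function names, thresholds and the ranking rule are illustrative assumptions, not the exact logic of the system described here.

```python
# Hypothetical combination of multi-CNN annotations for pre-selection:
# generic object-recognition tags filter for campaign relevance, and a
# brand-specific visual brand identity (VBI) score filters for brand fit.
from dataclasses import dataclass


@dataclass
class AnnotatedImage:
    url: str
    generic_tags: dict   # tag -> confidence, e.g. {"beach": 0.91}
    vbi_score: float     # 0..1 fit with the brand's visual identity


def preselect(images, campaign_tags, vbi_threshold=0.7, tag_threshold=0.5):
    """Keep images that fit the brand identity and match at least one
    campaign-relevant tag; rank them best-first for manual curation."""
    selected = []
    for img in images:
        if img.vbi_score < vbi_threshold:
            continue  # does not fit the brand's visual identity
        matched = {t for t, c in img.generic_tags.items()
                   if t in campaign_tags and c >= tag_threshold}
        if matched:
            selected.append((img.vbi_score, img.url, matched))
    # highest VBI score first, so curators see the best candidates first
    return sorted(selected, key=lambda item: item[0], reverse=True)


stream = [
    AnnotatedImage("img1.jpg", {"beach": 0.91, "car": 0.10}, 0.85),
    AnnotatedImage("img2.jpg", {"dog": 0.80}, 0.95),
    AnnotatedImage("img3.jpg", {"beach": 0.60}, 0.40),
]
shortlist = preselect(stream, campaign_tags={"beach", "sunset"})
print([url for _, url, _ in shortlist])  # only img1.jpg passes both filters
```

In this sketch the VBI score acts as a hard gate while the generic tags provide the campaign-specific relevance signal; only the surviving images would reach the search interface used for the final manual selection.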