Crowd vs. Experts: Nichesourcing for Knowledge Intensive Tasks in Cultural Heritage

Jasper Oosterman, Alessandro Bozzon, Geert-Jan Houben
Delft University of Technology, Delft, The Netherlands
j.e.g.oosterman, a.bozzon, g.j.p.m.houben@tudelft.nl

Archana Nottamkandath, Chris Dijkshoorn, Lora Aroyo
VU University Amsterdam, Amsterdam, The Netherlands
a.nottamkandath, c.r.dijkshoorn, lora.aroyo@vu.nl

Mieke H. R. Leyssen, Myriam C. Traub
Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
mieke.leyssen, myriam.traub@cwi.nl

ABSTRACT
The results of our exploratory study provide new insights into crowdsourcing knowledge intensive tasks. We designed and performed an annotation task on a print collection of the Rijksmuseum Amsterdam, involving experts and crowd workers in the domain-specific description of depicted flowers. We created a testbed to collect annotations from flower experts and crowd workers and analyzed these with regard to user agreement. The findings show promising results, demonstrating how, for given categories, nichesourcing can provide useful annotations by connecting crowdsourcing to domain expertise.

Categories and Subject Descriptors
H.4 [Information Systems Applications]: Miscellaneous

Keywords
Crowdsourcing; Nichesourcing; Cultural Heritage; Tagging; Knowledge Intensive Tasks

1. INTRODUCTION
The Rijksmuseum Amsterdam (http://rijksmuseum.nl) has a collection of 700,000 prints depicting birds, flowers, castles, people, etc. Due to time and knowledge constraints, its professional annotators describe depicted elements using broad terms like bird or flower. To go beyond such general terms, people with domain expertise need to be found and engaged, a process called nichesourcing [1].

Enrichment of Cultural Heritage collections has been the target of previous research. The “Your Paintings” project aims at digitizing and annotating 200,000 publicly owned oil paintings in the UK [2]. The Steve project [4] studied crowd tagging of collections from more than 12 USA-based museums and compared crowd and professional taggers. The Netherlands Institute for Sound and Vision studied crowd tagging of heritage videos using a game called WAISDA [3]. However, these initiatives do not focus on knowledge intensive tasks.

In this paper we present the results of an exploratory study focusing on a knowledge intensive task: the annotation of prints (lithographs) depicting flowers from the Rijksmuseum. Annotating such prints requires: time, to properly inspect the content of the print; skills, to correctly identify flowers; and knowledge, to correctly specify the (botanical) name of the depicted flowers. Further complications are that prints often lack color and detail, and depict stylized, abstract, or even fantasy sceneries. Crowdsourcing platforms such as Amazon Mechanical Turk allow us to reach out to a large number of potential crowd annotators. In our study we try to answer the following questions: How can crowd annotators provide useful annotations for knowledge intensive tasks? What is the relation between task difficulty and crowd annotation behavior?
The contributions of this exploratory study include an analysis of crowd and expert annotations for flower prints in the Rijksmuseum collection, and a dataset with expert and crowd annotations to be used for further study.

2. EXPERIMENTAL SETUP
Our dataset consists of 86 prints depicting one or more flowers from the Rijksmuseum Amsterdam. We classified each print along two dimensions: whether the print depicts a single flower or multiple flowers, and whether the depicted flower(s) are prominent (the main element of the artwork) or non-prominent (a detail). The collection contains 8 Single Prominent (SP), 9 Multiple Prominent (MP), 16 Single Non-prominent (SNP), and 53 Multiple Non-prominent (MNP) prints.

The experiment addressed two target populations: persons with known domain expertise (experts) and anonymous workers drawn from crowdsourcing platforms (crowd workers). Our recruitment efforts resulted in 4 responding experts. Crowd workers were recruited by posting tasks on multiple crowdsourcing platforms: Amazon Mechanical Turk, Point Dollars, and Vivatic, resulting in 75 crowd workers.
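To make the two-dimensional classification concrete, the following is a minimal Python sketch of how prints could be categorized and tallied. The FlowerPrint record, its field names, and the helper functions are illustrative assumptions for this sketch, not part of the study's actual tooling or the Rijksmuseum data model.

    from collections import Counter
    from dataclasses import dataclass

    @dataclass
    class FlowerPrint:
        # Hypothetical record for one print; field names are illustrative.
        print_id: str
        multiple_flowers: bool   # True: multiple flowers, False: a single flower
        prominent: bool          # True: flower(s) are the main element of the artwork

    def category(p: FlowerPrint) -> str:
        # Map a print onto the four classes used in the study: SP, MP, SNP, MNP.
        count = "M" if p.multiple_flowers else "S"
        prominence = "P" if p.prominent else "NP"
        return count + prominence

    def class_distribution(prints: list[FlowerPrint]) -> Counter:
        # Tally prints per class.
        return Counter(category(p) for p in prints)

    # Example: on the 86 prints described above, class_distribution would yield
    # Counter({'MNP': 53, 'SNP': 16, 'MP': 9, 'SP': 8}).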