Structured vs. unstructured tagging – A case study

Judit Bar-Ilan, Snunith Shoham, Asher Idan, Yitzchak Miller and Aviv Shachak
Department of Information Science, Bar-Ilan University
Ramat-Gan, Israel
972-3-5318351
{barilaj,shohas}@mail.biu.ac.il, asherid@netvision.net.il, milleri@mail.biu.ac.il and shachaa4@biu.013.net.il

ABSTRACT
In this paper we describe and discuss a tagging experiment of images related to Israeli and Jewish cultural heritage. The first group of participants was asked to assign the images tags that describe them, while the second group was asked to provide free-text values to predefined metadata elements. The results show that on the one hand structured tagging provides guidance to the users, but on the other hand different interpretations of the meaning of the elements may worsen the tagging quality instead of improving it. Our recommendation is to experiment with a system where the users provide both the tags and the context of these tags.

Categories and Subject Descriptors
H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing – indexing methods.

General Terms
Experimentation

Keywords
Structured tagging, unstructured tagging, images, cultural heritage

1. INTRODUCTION
Image tagging on the Web has recently become extremely popular (see, for example, the popular photo sharing and tagging services Flickr (http://flickr.com/) and Smugmug (http://www.smugmug.com/)). Although we do not challenge the saying that “a picture is worth a thousand words”, currently the best method to retrieve pictures is based on textual descriptions. Image processing and recognition are very active research fields, but there are no well-developed commercial systems that retrieve images based on image recognition. Yahoo! and Google image searches are also based on the text near the <img> tag in the HTML files.
The popular tagging systems usually do not impose any limitations on the choice of tags (del.icio.us allows only single-word tags, but otherwise there are no limitations). In addition to free tagging, there are highly structured metadata systems as well. One example is VRA Core Version 3 (http://www.vraweb.org/vracore3.htm), a very detailed metadata element set enabling the creation of records “to describe works of visual culture as well as the images that document them” [5]. The VRA metadata set defines the elements (fields in the traditional librarian jargon), which should be given values (filled in). The recommendation is to use controlled vocabularies, especially the Getty vocabularies [2]. Not only is the use of controlled vocabularies recommended, but specific rules also apply to assigning values to particular fields (e.g., date formats and rules for recording personal names). The VRA metadata set was created mainly for use in museums, and thus it is meant to be filled in by professionals. A simpler set of metadata elements is the Dublin Core [1], which is intended to be used by the general public (especially the simple version of the Dublin Core). The central aim of the developers of the DC (Dublin Core) was to substantially improve “resource discovery capabilities by enabling field-based (e.g., author, title) searches, permitting indexing of non-textual objects” [6]. The Dublin Core has a simple and a qualified version. Although it recommends the use of different controlled vocabularies for some of the elements, these are only recommendations and all fields can be assigned free-text values. The basic philosophy of the DC and of the librarian world in general is that field-based resource discovery is much more effective than free-text search. An often-mentioned example is that field-based search enables one to differentiate between books written by Shakespeare and books written about Shakespeare.
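The Shakespeare example can be illustrated with a minimal sketch. It assumes a toy record with three fields named after the Dublin Core elements title, creator and subject. The two records, the class name Record, and the functions free_text_search and field_search are hypothetical and serve only as an illustration; they are not part of the Dublin Core or of any system discussed here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Record:
    """Toy record; field names borrowed from Dublin Core (assumption)."""
    title: str
    creator: str
    subject: List[str] = field(default_factory=list)


def free_text_search(records, term):
    """Match the term anywhere in the record, ignoring field boundaries."""
    term = term.lower()
    hits = []
    for r in records:
        blob = " ".join([r.title, r.creator, *r.subject]).lower()
        if term in blob:
            hits.append(r)
    return hits


def field_search(records, field_name, term):
    """Match the term only within the named field."""
    term = term.lower()
    hits = []
    for r in records:
        value = getattr(r, field_name)
        values = value if isinstance(value, list) else [value]
        if any(term in v.lower() for v in values):
            hits.append(r)
    return hits


# Hypothetical records: one work by Shakespeare, one about him.
records = [
    Record("Hamlet", "William Shakespeare", ["Tragedy"]),
    Record("Shakespeare: A Hypothetical Biography", "Jane Doe",
           ["William Shakespeare", "Biography"]),
]

everything = free_text_search(records, "shakespeare")
by_him = field_search(records, "creator", "shakespeare")
about_him = field_search(records, "subject", "shakespeare")
```

In this sketch the free-text query returns both records, whereas the creator query returns only the work by Shakespeare and the subject query returns only the work about him. Separating these two cases is what the field-based approach is meant to achieve.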
Library catalogs and bibliographic databases are well-known examples of this approach. As an initial phase of a larger project, we conducted a small experiment to compare field-based descriptions of images with free tags of the same images.

2. THE EXPERIMENT
The goal of this experiment was to compare free-text tagging with what we call structured tagging, i.e., assigning free-text keywords (tags) to predefined metadata elements (fields). Twelve images related to Jewish and Israeli cultural heritage were chosen (see Figure 1). For each image, the Web page from which the image was taken was given, in order to provide some context for the picture. For example, the first picture was taken outside the building in which Israel’s independence was declared in 1948. The page, http://www.knesset.gov.il/docs/eng/megilat_eng.htm, provides clear context in this case. For some of the other images, the context is much less clear. For example, picture 11 depicts part of the Arch of Titus in Rome. The picture was taken from http://www.biblelight.net/temple.htm, where the specific picture serves mostly as an illustration. Picture 5 is one of the famous Chagall windows at the Hadassah Hospital in Jerusalem. The page itself provides no context; the tagger can either rely on his/her previous knowledge or try to access the parent page, http://www.md.huji.ac.il/special/chagall/, in order to learn something about the image.