Study support and integration of cultural information resources with Linked Data

Tetsuro KAMURA
Graduate University for Advanced Studies (SOKENDAI), Tokyo, JAPAN
kamura@nii.ac.jp

Ikki OHMUKAI
National Institute of Informatics, Tokyo, JAPAN

Toru TAKAHASHI
ATR Media Information Science Laboratories, Kyoto, JAPAN

Hideaki TAKEDA
National Institute of Informatics, Tokyo, JAPAN

Fumihiro KATO
National Institute of Informatics, Tokyo, JAPAN

Hiroshi UEDA
ATR-Promotions Inc., Kyoto, JAPAN

Abstract: A museum collection search system based on Linked Data, called Linked Open Data for Academia (LODAC) Museum, has been developed. The LODAC Museum identifies and associates artist, artwork, and museum information from several museums and provides the integrated data as Linked Data through a SPARQL endpoint. The purpose of this project is to provide an information distribution system that can share and publish a wide range of data as Linked Data, especially in the artistic and cultural fields in Japan. Different types of data are currently being integrated, and new approaches and support for studying these fields are being investigated.

Keywords: Museum; Linked Data; Semantic Web; RDF; SPARQL

I. INTRODUCTION

In this paper, we introduce a prototype system called "LODAC Museum" that integrates museum information across multiple resources. We identified and associated artist and artwork information from several museum collections with different types of information to provide integrated views of them. We then investigated the possibility of new approaches and support for studying the arts and culture fields.

II. PURPOSE

Valuable information should be used. To make this possible, we should establish a cycle of information, i.e., Publish, Share, Collect, Use, and Create. This is crucial in creative fields such as the arts and culture. For this purpose, information needs to be published in a form that allows re-use. Linked Data meets this need exactly, since its purpose is to share data openly in a re-usable format.
Japanese museums each maintain and publish information under their own metadata schemas, which makes cross-collection searching difficult. As a result, a search yields only fragments of information, and information from several sources needs to be integrated by using Linked Data. In addition, we suppose that new knowledge can be found and new methods discovered by using not only museum collection data but also other sources, for example, libraries, thesauruses, terminology, and GIS.

III. APPROACH

A. Method

The LODAC Museum is an integrated metadata database of Japanese museum collections. It provides metadata for artworks, creators, and relevant museum information in various RDF formats. The collected data currently number ca. 130,000, drawn from 15 museums, DBpedia Lite Japanese, and GIS data from the National and Regional Planning Bureau. The procedure in the LODAC Museum is as follows:

1) Scraping from Web pages: Collect data from web pages in different sources, identify and extract metadata from each page, and store the data with the identified metadata schema.
2) Mapping Vocabularies: Map each individual metadata schema to a single common schema containing the essential elements.
3) Integrating unique items: Identify the same items (artwork, creator, museum) across museum collections and associate them with single identifiers.
4) Publishing: Publish the data as RDF with permalinks that work as identifiers for people, artworks, and museum locations, accessible through a SPARQL endpoint.

In this way, users can work with the information both as strings and as links to data on other sites.

B. Development

1) Canonical Data: Canonical data enable the integration of data from different sources. We adopt the "Japan Art Thesaurus" as the canonical data; it contains many types of objects, such as creators, artwork titles, museum locations, and books.

2) Vocabularies: We do not describe detailed vocabularies for context.
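
To make the mapping and integration steps of the procedure under A. Method more concrete, the following is a minimal Python sketch, not the LODAC implementation. All field names, source names, sample records, and the example.org identifier base are hypothetical, and plain string normalization stands in for matching against canonical data such as the Japan Art Thesaurus.

```python
import unicodedata

# Hypothetical per-museum field names mapped to a minimal common schema.
FIELD_MAPS = {
    "museum_a": {"sakusha": "creator", "sakuhin_mei": "title"},
    "museum_b": {"artist_name": "creator", "work_title": "title"},
}


def to_common(source, record):
    """Rename source-specific fields to the common schema; drop unmapped fields."""
    mapping = FIELD_MAPS[source]
    common = {mapping[k]: v for k, v in record.items() if k in mapping}
    common["source"] = source
    return common


def name_key(name):
    """Normalize a name for matching: NFKC, lowercase, whitespace removed."""
    return "".join(unicodedata.normalize("NFKC", name).lower().split())


def integrate_creators(records, base_uri="http://example.org/creator/"):
    """Assign one identifier per distinct normalized creator name (assumed rule)."""
    ids = {}
    for rec in records:
        key = name_key(rec["creator"])
        if key not in ids:
            ids[key] = f"{base_uri}{len(ids) + 1}"
        rec["creator_id"] = ids[key]
    return records


# Made-up sample records from two hypothetical sources.
raw = [
    ("museum_a", {"sakusha": "Yamada Taro", "sakuhin_mei": "Work A", "shelf": "3"}),
    ("museum_b", {"artist_name": "YAMADA  Taro", "work_title": "Work B"}),
    ("museum_b", {"artist_name": "Suzuki Hanako", "work_title": "Work C"}),
]
records = integrate_creators([to_common(src, rec) for src, rec in raw])
```

In this sketch the two spellings of the same hypothetical creator receive one shared identifier, while fields outside the common schema (here `shelf`) are discarded. A real system would likely need stronger matching than name normalization, for example lookups against a thesaurus of known creators.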
Since our purpose is to integrate information from different resources, we provide a metadata schema with only the essential elements. These include people's names, titles,

2011 Second International Conference on Culture and Computing 978-0-7695-4546-2/11 $26.00 © 2011 IEEE DOI 10.1109/Culture-Computing.2011.53 177