OpenAIRE LOD services: Scholarly Communication Data as Linked Data

Giorgos Alexiou¹,², Sahar Vahdati³, Christoph Lange³,⁴, George Papastefanatos¹, Steffen Lohmann³,⁴

¹ Institute for the Management of Information Systems, Athena Research Center, Greece
² School of Electrical and Computer Engineering, National Technical University of Athens, Greece
³ Enterprise Information Systems (EIS) department, University of Bonn, Germany
⁴ Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS), Sankt Augustin, Germany

galexiou@imis.athena-innovation.gr, vahdati@uni-bonn.de, langec@cs.uni-bonn.de, gpapas@imis.athena-innovation.gr, steffen.lohmann@iais.fraunhofer.de

Abstract. OpenAIRE, the Open Access Infrastructure for Research in Europe, enables search, discovery and monitoring of the publications and datasets from 100,000+ research projects. Increasing the reusability of the OpenAIRE research metadata, connecting it to other open data about projects, publications, people and organizations, and reaching out to further related domains all require better technical interoperability, which we aim to achieve by exposing the OpenAIRE Information Space as Linked Data. We present a scalable and maintainable architecture that converts the OpenAIRE data from its original HBase NoSQL source to RDF. We furthermore explore how this novel integration of data about research can facilitate scholarly communication.

1 Introduction

OpenAIRE (OA)¹ is the European Union's flagship project for an Open Access Infrastructure for Research; it enables search, discovery and monitoring of scientific outputs (more than 13M publications, 12M authors and scientific datasets), harvested from over 6K data providers and linked to more than 100K research projects funded by EU and Australian bodies. To increase the interoperability of the OA Information Space (IS), we have published its data as Linked Open Data (LOD).
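To make the idea of exposing a record as RDF more concrete, the following minimal sketch serializes one record as N-Triples. It is an illustration under stated assumptions, not the conversion pipeline described in this paper: the namespaces `BASE` and `VOCAB`, the record layout (`type`, `id`, `attributes`, `links`) and the function names are all hypothetical, since the actual OA vocabulary and URI scheme are not given in this section.

```python
from typing import Dict, List

# Hypothetical namespaces for illustration only; the real OA LOD
# vocabulary and URI scheme are not specified in this section.
BASE = "http://example.org/openaire/"
VOCAB = "http://example.org/openaire/vocab#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record: Dict) -> List[str]:
    """Turn one record (assumed layout) into a list of N-Triples lines."""
    subject = f"<{BASE}{record['type']}/{record['id']}>"
    # One rdf:type triple derived from the record type, e.g. "publication".
    lines = [f"{subject} <{RDF_TYPE}> <{VOCAB}{record['type'].capitalize()}> ."]
    # Literal-valued attributes become plain string literals.
    for key, value in sorted(record.get("attributes", {}).items()):
        lines.append(f'{subject} <{VOCAB}{key}> "{escape_literal(str(value))}" .')
    # Links to other entities become IRI-valued triples.
    for key, target in sorted(record.get("links", {}).items()):
        lines.append(f"{subject} <{VOCAB}{key}> <{BASE}{target}> .")
    return lines
```

Calling `record_to_ntriples` on a publication record with a `title` attribute and one link to a project yields three lines: a type triple, a literal triple and a link triple. A production converter would additionally need datatype and language handling as well as IRI validation, which this sketch leaves out.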
In our previous work [5], we specified a vocabulary for the OA LOD and experimented with different implementations for publishing the OA IS as LOD. Based on this preliminary work, we have developed, and now present, a scalable implementation over Hadoop that can efficiently publish large volumes of scholarly data, and through which OA can offer three different LOD services: (i) fine-grained exploration of data records about individual entities in the OA IS, (ii) a downloadable all-in-one data dump, and (iii) interactive querying via a SPARQL endpoint, i.e., a standardized query interface. On top of this setup we can add further services, e.g., for visual exploration or data analysis, and proceed with linking the OA data to related datasets.
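As an illustration of the third service, interactive querying, the sketch below prepares a SPARQL SELECT request following the SPARQL 1.1 Protocol (query passed as the `query` URL parameter of a GET request). The endpoint URL, the `ex:` prefix, and the class and property names in the query are placeholders assumed for this example; the actual OA endpoint address and vocabulary terms are not stated in this section.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder endpoint; substitute the real SPARQL endpoint address.
ENDPOINT = "http://example.org/sparql"

# Hypothetical vocabulary terms (ex:Publication, ex:title) for illustration.
QUERY = """
PREFIX ex: <http://example.org/openaire/vocab#>
SELECT ?publication ?title WHERE {
  ?publication a ex:Publication ;
               ex:title ?title .
}
LIMIT 10
"""


def build_select_request(endpoint: str, query: str) -> Request:
    """Build a GET request that carries the query in the 'query' parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialization of SPARQL query results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_select_request(ENDPOINT, QUERY)
```

The prepared `request` could then be sent with `urllib.request.urlopen(request)` and the response body parsed with `json.loads`; that step is omitted here because it depends on a live endpoint.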