Proposed metrics for data accessibility in the context of linked open data
Mahdi Zahedi Nooghabi and Akram Fathian Dastgerdi
Ferdowsi University of Mashhad, Mashhad, Iran
Abstract
Purpose – One of the most important categories in linked open data (LOD) quality models is
“data accessibility.” The purpose of this paper is to propose some metrics and indicators for assessing
data accessibility in LOD and the semantic web context.
Design/methodology/approach – The authors first review data quality and LOD quality
models to identify the subcategories proposed for the data accessibility dimension in the
literature. Then, following the goal question metric (GQM) approach, they specify the project
goals, main issues, and questions. Finally, they propose metrics for assessing data
accessibility in the context of the semantic web.
Findings – Using the GQM approach, the authors identified three main issues for data
accessibility: data availability, data performance, and data security policy. They then
formulated four main questions related to these issues and proposed 27 metrics for answering
these questions.
Originality/value – One of the main challenges in data quality is the lack of widely agreed
quality metrics and practical instruments for evaluating quality. Accessibility is an
important aspect of data quality, yet little research has provided metrics and indicators for
assessing data accessibility in the context of the semantic web. This research therefore
addresses the data accessibility dimension and proposes a comparatively comprehensive set
of metrics.
Keywords Linked open data, Accessibility, Data quality, Semantic web, Data accessibility metrics,
GQM approach
Paper type Research paper
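The GQM hierarchy summarized in the abstract (a goal refined into issues, issues into questions, questions into metrics) can be illustrated with a small data structure. What follows is a minimal sketch, not the authors' instrument: only the three issue names come from the abstract, while the class names, the question texts, the metric names, and the `availability_ratio` function are hypothetical placeholders introduced for illustration. The paper's actual four questions and 27 metrics are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Metric:
    name: str


@dataclass
class Question:
    text: str
    metrics: List[Metric] = field(default_factory=list)


@dataclass
class Issue:
    name: str
    questions: List[Question] = field(default_factory=list)


@dataclass
class Goal:
    statement: str
    issues: List[Issue] = field(default_factory=list)

    def metric_count(self) -> int:
        # Total number of metrics reachable from this goal.
        return sum(len(q.metrics) for i in self.issues for q in i.questions)


def availability_ratio(successes: int, attempts: int) -> float:
    """Hypothetical metric: share of retrieval attempts that returned data."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return successes / attempts


# Issue names are taken from the abstract; every question and metric
# below is a placeholder, not one of the paper's own.
goal = Goal(
    statement="Assess data accessibility in linked open data",
    issues=[
        Issue("data availability", [
            Question("(placeholder) Can the data be retrieved when requested?",
                     [Metric("(placeholder) availability ratio")]),
        ]),
        Issue("data performance", [
            Question("(placeholder) How quickly is the data delivered?",
                     [Metric("(placeholder) mean response time")]),
        ]),
        Issue("data security policy", [
            Question("(placeholder) Is access to the data governed by a stated policy?",
                     [Metric("(placeholder) presence of an access policy")]),
        ]),
    ],
)
```

Keeping metrics attached to questions, and questions to issues, preserves the traceability that GQM is meant to provide: each measured value can be traced back to the issue and goal that motivated it.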
Data accessibility metrics in the context of linked open data (LOD)
Data quality is a multi-dimensional concept (Pipino et al., 2002), for which various
definitions have been provided. In general, the definitions of data (or information)
quality take either an intrinsic or a contextual view of information. In the intrinsic view,
information properties are largely defined independent of a specific user, task, or
application. In the context-based view, information quality is primarily defined relative
to the user, the task, and the application of the information. Moreover, the literature
adds a representational dimension to the current views. The representational
dimension addresses the extent to which information presentation effectively facilitates
interpretation and understanding (Nelson et al., 2005). Overall, the most cited
definition of data quality in the literature is fitness for use (Wang and Strong, 1996; Strong
et al., 1997; Wang, 1998; Tayi and Ballou, 1998; Veregin, 1999). This definition
emphasizes the importance of user judgment for accepting or rejecting data based on
the usability of the data provided. In general, the idea of data or information quality
depends on the actual use of data, the design of the system, and the production
processes involved in data generation (Wand and Wang, 1996).
To better understand what we mean by data quality, we first need to define data.
Data are the primary base of information that describes real-world objects in a format
Program
Vol. 50 No. 2, 2016
pp. 184-194
© Emerald Group Publishing Limited
0033-0337
DOI 10.1108/PROG-01-2015-0007
Received 26 January 2015
Revised 29 October 2015
Accepted 7 December 2015