Towards Budget Comparative Analysis: The Need for Fiscal Code Lists as Linked Data

Panagiotis-Marios Filippidis
Open Knowledge Greece and Aristotle University of Thessaloniki, School of Journalism and Mass Communications, Thessaloniki, Greece
pafilipp@jour.auth.gr

Sotirios Karampatakis
Open Knowledge Greece and Aristotle University of Thessaloniki, School of Mathematics, Thessaloniki, Greece
sokaramp@auth.gr

Lazaros Ioannidis
Open Knowledge Greece and Aristotle University of Thessaloniki, School of Mathematics, Thessaloniki, Greece
larjohn@math.auth.gr

Jindřich Mynarz
Department of Information and Knowledge Engineering, University of Economics, W. Churchill Sq. 4, 130 67 Prague, Czech Republic
jindrich.mynarz@vse.cz

Vojtěch Svátek
Department of Information and Knowledge Engineering, University of Economics, W. Churchill Sq. 4, 130 67 Prague, Czech Republic
svatek@vse.cz

Charalampos Bratsas
Open Knowledge Greece and Aristotle University of Thessaloniki, School of Mathematics, Thessaloniki, Greece
cbratsas@math.auth.gr

ABSTRACT
Code lists are a key part of budget datasets, as they are used to code the fiscal concepts within them. However, the great diversity of classifications across countries and concepts means that their actual value as dimension properties cannot be taken for granted. In this paper we discuss the need for creating code lists as Linked Data for the classifications used in fiscal datasets, in three basic steps. First, code lists have to be extracted from fiscal datasets, especially when the budget description contains no metadata that would readily identify them. Next, code lists from different datasets or sources have to be represented in the same way, using the SKOS vocabulary, so that they can be linked with each other. Finally, linking similar code lists will also allow the containing datasets to be linked, increasing the possibilities for data analysis and knowledge extraction.
CCS Concepts
•Information systems → Resource Description Framework (RDF); Ontologies; Extraction, transformation and loading; Hierarchical data models; Information retrieval diversity;

Keywords
Linked Data, SKOS, Knowledge Extraction

© 2016 Copyright held by the author/owner(s). SEMANTICS 2016: Posters and Demos Track, September 13-14, 2016, Leipzig, Germany

1. INTRODUCTION
Budget datasets contain detailed information about how public money is spent on the functions of government. They include fields that refer to fiscal and other budget-related concepts. Many of these fields have a specific range of values. To this end, statistical agencies across Europe have created appropriate code lists, which are prescribed controlled vocabularies that contain all the values a specific field can take.

Code lists are an essential part of a budget dataset, as they are used to code concepts that could otherwise be written or described in many ways. The hierarchical structure of most code lists arranges budget-related concepts into a hierarchy, so they can support aggregated views over data, for example over a particular expenditure category or a municipal administration office. Additionally, standardized code lists are a key device for making fiscal data comparable. This information can be shared across several datasets and interlinked with external data, allowing comparisons between budgets of different years and of different organizations, since these use the same codes for concepts that they would otherwise describe in different ways.

International authorities propose many generic code lists that have been fully or partially adapted to the national budget representation of European countries. The most common case of differentiation is when a national code list contains one or two additional levels of detail compared with the international classification, according to the respective needs of the country.
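The aggregated views that a hierarchical code list supports can be illustrated with a minimal sketch. It is not taken from any of the datasets discussed here: the codes, the amounts and the `roll_up` helper are hypothetical, and the parent mapping only mimics the role that `skos:broader` would play in a SKOS representation. Each budget line is coded at the most detailed level, and its amount is added to every broader concept above it.

```python
from collections import defaultdict

# Hypothetical code list: each code maps to its broader (parent) code,
# playing the role of skos:broader. Top-level codes have no parent.
BROADER = {
    "01": None,
    "01.1": "01",
    "01.1.1": "01.1",  # e.g. an extra national level of detail
    "01.1.2": "01.1",
    "01.2": "01",
}

# Hypothetical budget lines as (code, amount) pairs.
BUDGET_LINES = [
    ("01.1.1", 100.0),
    ("01.1.2", 50.0),
    ("01.2", 25.0),
]


def roll_up(lines, broader):
    """Add each amount to its own code and to all of its broader codes."""
    totals = defaultdict(float)
    for code, amount in lines:
        current = code
        while current is not None:
            totals[current] += amount
            current = broader[current]
    return dict(totals)


totals = roll_up(BUDGET_LINES, BROADER)
```

Under these assumptions, two budgets that share the upper levels of a code list can be compared at those levels even if one of them adds further national levels of detail below.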
Furthermore, European countries have also established and use their own code lists. This leads to a variety of classifications for the same concepts, and countries may also apply code lists to different fields of the budget.

However, the use of officially proposed code lists in budgets has been rather limited so far. We noticed that in many cases the classifications used in budget datasets differ from both the internationally and the nationally proposed code lists, and no relevant metadata about them are available. Thus, the ambiguous use of code lists by countries and municipalities is a very com-