HISTORIAE, History of Socio-Cultural
Transformation as Linguistic Data Science.
A Humanities Use Case
Florentina Armaselu¹
Centre for Contemporary and Digital History (C²DH), University of Luxembourg, Luxembourg
Elena-Simona Apostol
Department of Computer Science and Engineering, Faculty of Automatic Control and Computers,
University Politehnica of Bucharest, Romania
Anas Fahad Khan
Institute for Computational Linguistics «A. Zampolli»,
National Research Council of Italy, Pisa, Italy
Chaya Liebeskind
Department of Computer Science, Jerusalem College of Technology, Israel
Barbara McGillivray
Theoretical and Applied Linguistics, Faculty of Modern and Medieval Languages and Linguistics,
University of Cambridge, UK
The Alan Turing Institute, London, UK
Ciprian-Octavian Truică
Department of Computer Science and Engineering, Faculty of Automatic Control and Computers,
University Politehnica of Bucharest, Romania
Giedrė Valūnaitė Oleškevičienė
Institute of Humanities, Mykolas Romeris University, Vilnius, Lithuania
Abstract
The paper proposes an interdisciplinary approach including methods from disciplines such as history of
concepts, linguistics, natural language processing (NLP) and Semantic Web, to create a comparative
framework for detecting semantic change in multilingual historical corpora and generating diachronic
ontologies as linguistic linked open data (LLOD). Initiated as a use case (UC4.2.1) within the COST
Action Nexus Linguarum, European network for Web-centred linguistic data science, the study will
explore emerging trends in knowledge extraction, analysis and representation from linguistic data
science, and apply the devised methodology to datasets in the humanities to trace the evolution
of concepts from the domain of socio-cultural transformation. The paper will describe the main
elements of the methodological framework and preliminary planning of the intended workflow.
2012 ACM Subject Classification Computing methodologies → Semantic networks; Computing
methodologies → Ontology engineering; Computing methodologies → Temporal reasoning; Computing
methodologies → Lexical semantics; Computing methodologies → Language resources; Computing
methodologies → Information extraction
Keywords and phrases linguistic linked open data, natural language processing, semantic change,
diachronic ontologies, digital humanities
Digital Object Identifier 10.4230/OASIcs.LDK.2021.34
Author Contributions F.A., Sections 1, 2.1, 2.2, 2.5, 2.6, 2.7, 3; E.S.A., Section 2.4; A.F.K., Section
2.3; C.L., Section 1.3; B.M., Sections 1.3, 2.4; C.O.T., Section 2.4; G.V.O., Sections 1.3, 1.4. All the
authors critically revised and approved the final version submitted to the LDK 2021 proceedings.
¹ florentina.armaselu@uni.lu
© Florentina Armaselu, Elena-Simona Apostol, Anas Fahad Khan, Chaya Liebeskind, Barbara
McGillivray, Ciprian-Octavian Truică, and Giedrė Valūnaitė Oleškevičienė;
licensed under Creative Commons License CC-BY 4.0
3rd Conference on Language, Data and Knowledge (LDK 2021).
Editors: Dagmar Gromann, Gilles Sérasset, Thierry Declerck, John P. McCrae, Jorge Gracia, Julia Bosque-Gil,
Fernando Bobillo, and Barbara Heinisch; Article No. 34; pp. 34:1–34:13
OpenAccess Series in Informatics
Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany