Weaving Enterprise Knowledge Graphs: The Case of Company Ownership Graphs *

Paolo Atzeni, Università Roma Tre
Luigi Bellomarini, Banca d’Italia
Michela Iezzi, Banca d’Italia
Emanuel Sallinger, TU Wien and University of Oxford
Adriano Vlad, Università Roma Tre and University of Oxford

ABSTRACT
Motivated by our experience in building the Enterprise Knowledge Graph of Italian companies for the Central Bank of Italy, in this paper we present an in-depth case analysis of company ownership graphs, that is, graphs having company ownership as a central concept. In particular, we introduce and study three industrially relevant problems related to such graphs: company control, asset eligibility and detection of personal links. We formally characterize the problems and present Vada-Link, a framework based on state-of-the-art approaches for knowledge representation and reasoning. With our methodology and system, we solve the problems at hand in a scalable, model-independent and generalizable way. We illustrate the favourable architectural properties of Vada-Link and give an experimental evaluation of the approach.

1 INTRODUCTION
This paper is motivated by our experience in building the Enterprise Knowledge Graph of Italian companies for Banca d’Italia, the central bank of Italy.

Company ownership graphs are central objects in corporate economics [9, 19, 23, 36] and are of high importance for central banks, financial authorities and national statistical offices, which use them to solve relevant problems in different areas: banking supervision, creditworthiness evaluation, anti-money laundering, insurance fraud detection, economic and statistical research and many more. As shown in Figure 1, in such graphs, ownership is the core concept: nodes are companies and persons (black and blue nodes, respectively), and ownership edges (black solid links) are labelled with the fraction of shares that a company or person x owns of a company y. Company graphs are helpful in many situations.
One first important problem that can be solved with such graphs is company control (also effectively formalized in the context of logic programming [18]), which amounts to deciding whether a company x controls a company y, that is, whether x can push decisions through in y by holding the majority of the votes. Consider the graph in Figure 1: P1 controls C, D (via C), E (since it controls D, which owns 40% of E, and P1 directly owns 20% of it), and F (via E and D). Similarly, P2 controls all its descendants except for L. Apparently, P1 exerts no control on L either.

* The views and opinions expressed in this paper are those of the authors and do not necessarily reflect the official policy or position of Banca d’Italia. This work is supported by the EPSRC grant EP/M025268/1, the Vienna Science and Technology Fund (WWTF) grant VRG18-013, and the EC grant 809965.

© 2020 Copyright held by the owner/author(s). Published in Proceedings of the 23rd International Conference on Extending Database Technology (EDBT), March 30-April 2, 2020, ISBN 978-3-89318-083-7 on OpenProceedings.org. Distribution of this paper is permitted under the terms of the Creative Commons license CC-by-nc-nd 4.0.

Figure 1: Sample excerpt of a company ownership KG.

A second particularly representative application of company graphs is the collateral eligibility problem (also known as asset eligibility or close link), which consists of estimating the risk of granting a specific loan to a company x that is backed by collateral issued by another company y. According to European Central Bank regulations [1], company y cannot act as a guarantor for x if it is too “close” to it in terms of ownership; the regulation gives a detailed definition for this concept of closely-linked entity, which includes: “the two companies must not be owned by a common third party entity, which owns more than 20% of both”.
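As a rough illustration of the two notions introduced so far, the following Python sketch computes company control as a fixpoint over ownership edges and checks the single close-link clause quoted above. It is not the Vada-Link implementation. The function names, the edge dictionary and the 50% majority threshold are assumptions made for this example; only the 40% (D owns of E) and 20% (P1 owns of E) shares come from the text, and all other weights are hypothetical values chosen to reproduce the described outcome. The close-link check looks only at direct ownership, which simplifies the full regulatory definition.

```python
from collections import defaultdict

# Ownership edges: (owner, owned) -> fraction of shares.
# Only D->E (0.4) and P1->E (0.2) are stated in the text; every other
# weight is a hypothetical value picked for illustration.
OWNS = {
    ("P1", "C"): 0.6,  # assumed
    ("C", "D"): 0.7,   # assumed
    ("D", "E"): 0.4,   # from the text
    ("P1", "E"): 0.2,  # from the text
    ("E", "F"): 0.3,   # assumed
    ("D", "F"): 0.3,   # assumed
    ("P1", "L"): 0.3,  # assumed minority stake
}


def controlled_by(x, owns, threshold=0.5):
    """Return the set of companies controlled by x.

    x controls y if the shares of y held by x directly, plus the shares
    held by companies x already controls, exceed the threshold.
    """
    controlled = set()
    changed = True
    while changed:  # repeat until no new company is added (fixpoint)
        changed = False
        voters = controlled | {x}
        totals = defaultdict(float)
        for (owner, owned), share in owns.items():
            if owner in voters:
                totals[owned] += share
        for company, total in totals.items():
            if company != x and company not in controlled and total > threshold:
                controlled.add(company)
                changed = True
    return controlled


def closely_linked(a, b, owns, threshold=0.2):
    """Check one clause only: a common third party directly owns more
    than the threshold of both a and b."""
    owners_a = {o for (o, c), s in owns.items() if c == a and s > threshold}
    owners_b = {o for (o, c), s in owns.items() if c == b and s > threshold}
    return bool((owners_a & owners_b) - {a, b})


if __name__ == "__main__":
    print(sorted(controlled_by("P1", OWNS)))
    print(closely_linked("C", "L", OWNS))
```

With these assumed weights, `controlled_by("P1", OWNS)` contains C, D, E and F but not L, matching the description of P1 above. The loop recomputes all totals on every pass, which is simple rather than efficient; a declarative, rule-based formulation such as the logic-programming one cited in [18] expresses the same recursion more directly.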
With respect to Figure 1, we see that, for example, G and I are closely linked since P2 owns more than 20% of both.¹

¹ Actually, here we are stretching the concept of close links to include individuals.

Besides financial relationships, personal or family connections enable much broader use of such company graphs: detecting family businesses or studying the real dispersion of control [21] are just two such applications. In our example in Figure 1, knowing that P1 and P2 have personal connections (e.g., are married) allows us to deduce that, in fact, P1 and P2 together control L. Likely, they act as a single center of interest: L is in fact a family business, with control in the hands of a single family, with P1 and P2 together controlling 60% of it. Similarly, although D and G do not strictly fulfil the definition of close link, as P1 and P2 have a personal connection, there is very low risk differentiation between them and so it is reasonable to prevent G from acting as a guarantor for D or vice versa.

In all the above settings, and indeed in many more, it is our experience that the links representing relevant relationships in the financial realm are not immediately available in data stores. For instance, there are no enterprise graphs readily providing company control relationships, close links or family connections. Reasons fall mostly into four categories: (i) such links represent non-trivial relationships, whose calculation is complex and is not

Industry and Applications Paper. Series ISSN: 2367-2005. DOI: 10.5441/002/edbt.2020.66