Theoretical Knowledge Graph Reasoning via Ending Anchored Rules

Canlin Zhang§, Yannis Katsis, Yoshiki Vazquez-Baeza, Andrew Bartko, Ho-Cheol Kim, Chun-Nan Hsu
University of California, San Diego
IBM Research-Almaden
§ Contact via: caz004@ucsd.edu
Work done during a postdoc at UC San Diego funded by IBM Research.
arXiv:2011.06174v2 [cs.CL] 19 Nov 2020

Abstract

Discovering precise and specific rules from knowledge graphs is regarded as an essential challenge. Such rules can improve the performance of many downstream tasks and even provide new ways to approach some Natural Language Processing research topics. In this paper, we provide a fundamental theory for knowledge graph reasoning based on ending anchored rules. Our theory provides precise reasons explaining why a triple is or is not correct. We then implement our theory in what we call the EARDict model. Results show that the EARDict model achieves new state-of-the-art performance on benchmark knowledge graph completion tasks, including a Hits@10 score of 80.38 percent on WN18RR.

1 Introduction

A knowledge graph (KG) is a graphical representation of a knowledge base (KB), in which entities are represented by nodes and relations are represented by links among nodes. Knowledge graphs are useful tools in many Natural Language Processing (NLP) research areas, such as question answering (Zhang et al., 2016; Huang et al., 2019; Diefenbach et al., 2018), semantic parsing (Yih et al., 2015; Berant et al., 2013) and dialogue systems (He et al., 2017; Keizer et al., 2017). However, most knowledge graphs suffer from missing relations (Socher et al., 2013; West et al., 2014), which leads to the task of knowledge graph completion, or link prediction. The task aims at recovering the missing relations in a KG given the known ones.

In general, there are two approaches to knowledge graph completion: the embedding-based approach (Bordes et al., 2013a; Dettmers et al., 2018a; Nguyen et al., 2018) and the rule-based approach (Yang et al., 2017; Das et al., 2018; Pinter and Eisenstein, 2018). Generally speaking, embedding-based models represent entities and relations as real-valued vectors (Mikolov et al., 2013). Deep neural networks are then trained in an end-to-end manner (Zhou and Xu, 2015) on these vectorized embeddings to capture the semantic information in a knowledge graph. Although flexible and expressive (Socher et al., 2013), embedding-based models cannot provide definite explanations for their link prediction results (Glasmachers, 2017; Ormazabal et al., 2019).

In contrast, rule-based models attempt to capture the regularities present in a knowledge graph as rules. The link prediction results of these models can be explained by how these rules are followed (Yang et al., 2017; Das et al., 2018). However, rule-based models suffer from scalability issues and a lack of expressive power (Teru et al., 2020). That is, the discovered rules can only explain the relations among a small portion of entities.

In this paper, we present a rule-based model using ending anchored rules, which resolves the scalability issues and can explain the relations among a broad range of entities. Our contribution is two-fold: (i) We establish a fundamental theory for knowledge graph reasoning, based on which any knowledge graph can be completed in a reliable way. (ii) We propose the EARDict model, an efficient implementation of our theory, which resolves the scalability issues of traditional rule-based models through its strong expressive power.

In Section 2, we introduce our knowledge graph reasoning theory in detail. Then in Section 3, we describe how to implement our theory with the EARDict model and present the experimental results on two large benchmark datasets for knowledge graph completion. Section 4 discusses the results and tricky issues involved in our theory