Feature Engineering and Explainability with Vadalog: A Recommender Systems Application

Jack Clearman¹, Ruslan R. Fayzrakhmanov², Georg Gottlob²,³, Yavor Nenov², Stéphane Reissfelder², Emanuel Sallinger²,³, and Evgeny Sherkhonov²

¹ Meltwater Group
² University of Oxford
³ TU Wien

1 Introduction

Vadalog [2] is an extension of Datalog that features existential rules and a rich set of functions, libraries, and methods for connecting to external data sources, which make it a powerful tool for building advanced industrial AI applications [1]. Vadalog forms the core of an ongoing research collaboration between the University of Oxford and the media intelligence company Meltwater, which aims to build a recommender system that surfaces the most relevant insights about companies from outside data, including Meltwater's repository of millions of news articles. In this application paper, we demonstrate various aspects of such a recommender system in the movies domain, and show how Vadalog can be used for feature engineering and for computing explainable recommendations.

Recommender systems assist users in choosing the items most relevant to their interests, thus reducing the information load they experience. The typical methods used in recommender systems are based on an analysis of the items the user has already selected and are usually limited to "low-level" features, i.e., metadata associated with an item. However, such methods cannot provide suitable recommendations when discriminative low-level features are absent, or when the difference between liked and disliked items is captured by non-trivial combinations of features. In this paper, we approach this problem by building a new set of high-level features that capture domain knowledge and the non-trivial factors that influence a user's decision in choosing movies.
Vadalog is well suited to computing such high-level features because it supports: (1) aggregation, for computing features such as the total revenue of movies; (2) graph traversal, for computing properties of the co-starring graph; (3) integration of various data sources, for unified access to sources such as IMDB and RottenTomatoes; and (4) existential rules, used in computing recommendations for new users. Furthermore, the declarative nature of Vadalog allows high-level features to be developed rapidly (usually a few hours per feature from conception to deployment) and the resulting programs to be maintained easily. Finally, we demonstrate how to build an explainable ranking of recommendations, so that Vadalog provides explanations not only at the reasoning level, i.e., why a particular high-level feature has a certain value, but also at the machine learning level, i.e., why the items are ranked in a particular order.
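
To make the first two kinds of high-level features more concrete, the following is a minimal sketch in Python. It is not Vadalog and not the implementation used in this work. It only illustrates the idea of an aggregation feature (total revenue per actor) and a graph-traversal feature (distance between two actors in the co-starring graph). The toy cast table, the revenue figures, and all function names are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict, deque

# Hypothetical toy data (not from the paper): (movie, actor, movie revenue).
CAST = [
    ("m1", "alice", 100),
    ("m1", "bob", 100),
    ("m2", "bob", 50),
    ("m2", "carol", 50),
    ("m3", "dave", 20),
]


def total_revenue_per_actor(cast):
    """Aggregation feature: sum the revenue of every movie an actor appears in."""
    totals = defaultdict(int)
    for _movie, actor, revenue in cast:
        totals[actor] += revenue
    return dict(totals)


def costar_graph(cast):
    """Build an undirected graph linking actors who share at least one movie."""
    actors_by_movie = defaultdict(set)
    for movie, actor, _revenue in cast:
        actors_by_movie[movie].add(actor)
    graph = defaultdict(set)
    for actors in actors_by_movie.values():
        for actor in actors:
            graph[actor] |= actors - {actor}
    return graph


def costar_distance(graph, source, target):
    """Graph-traversal feature: breadth-first search for the shortest
    co-starring path length; returns None if the actors are not connected."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


revenue = total_revenue_per_actor(CAST)
graph = costar_graph(CAST)
print(revenue["bob"])                            # 150
print(costar_distance(graph, "alice", "carol"))  # 2
print(costar_distance(graph, "alice", "dave"))   # None
```

In a declarative language, each of these features would typically be expressed as a small set of rules (an aggregate over a join, and a recursive reachability rule) rather than as explicit loops, which is one reason such features can be quick to write and maintain.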