Using a SAT-based Model Finder to Verify First-Order Logic Ontologies of Space against Datasets: Scalability and Bottlenecks in Practice

Shirly Stephen and Torsten Hahmann
School of Computing and Information Science
University of Maine, Orono, ME 04469

Abstract

Semantic verification of an ontology encompasses checking its internal consistency, which ensures the ontology itself is free of contradictions, and its external consistency, which ensures that it is consistent with the kind of datasets it is intended to work with. Unlike description logic (DL) ontologies, first-order logic (FOL) ontologies are rarely verified externally against datasets because FOL model finders have been limited to tiny models with fewer than 20 individuals. Focusing on FOL ontologies of spatial relations, this paper investigates the source of the bottleneck via a formal analysis and experiments with the SAT-based model finder Paradox. We demonstrate that the presence of many defined terms of highest arity (here defined binary spatial relations), which are common in ontologies but not in other mathematical theories, significantly slows down model finding. It is shown that removing optional explicit definitions and substituting the terms they define with their definiens exponentially speeds up model finding, allowing easy construction of models with more than 100 individuals. While this is still small compared to what DL reasoners can handle, it is a significant improvement over the tiny, often trivial models to which verification of FOL ontologies has been traditionally restricted.

1 Introduction

Formal ontologies capture a domain’s terminology and its semantics in a logic-based language as a means to automatically reason about the domain. Such ontologies may provide the background knowledge necessary to interpret a dataset collected in the domain, to semantically integrate different datasets or applications, or to make implicit assumptions in the domain explicitly provable.
However, an ontology can only serve its purpose if we know that it correctly and adequately captures the modeled domain. Ontology evaluation comprises techniques and measures of ontologies’ correctness and adequacy, wherein ontology verification aims to verify an ontology’s correctness (Vrandečić 2009; Gómez-Pérez 2004) and ontology validation its adequacy (Obrst et al. 2007; Gómez-Pérez 2004). While ontology validation requires significant human intervention, many aspects of ontology verification benefit from extensive automation (Grüninger et al. 2010), including checking an ontology’s logical consistency. This comes in two forms: (1) checking an ontology’s internal consistency, which rules out contradictory states by generating some model, and (2) checking its external consistency with datasets that are representative of the ontology’s intended domain or application.

Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

While description logic (DL) ontologies, including OWL ontologies, can be efficiently verified internally and externally even with large datasets (i.e., a large ABox), first-order logic (FOL) ontologies are often only internally verified. One reason for this is that FOL ontologies often exclusively formalize the structure of a domain (i.e., the terms and how they can be interpreted) and rarely contain facts or data points about individuals (i.e., they typically lack an ABox). More importantly, model finding for FOL ontologies is not only theoretically incomplete but has also not been very successful in practice, except for tiny, often trivial models. This paper examines this assumption more closely and identifies specific bottlenecks that can be remedied for model finding to scale better in practice despite its theoretical undecidability and intractability.
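The two forms of consistency checking distinguished above can be illustrated with a minimal brute-force sketch. It assumes a toy theory with a single binary relation P axiomatized as a partial order (reflexive, antisymmetric, transitive); these axioms, the function names, and the sample facts are illustrative assumptions and are not taken from the ontologies studied in this paper. The exhaustive enumeration is also not how Paradox searches for models; it only makes the notions concrete.

```python
from itertools import product

def models(n, axioms, facts=()):
    """Yield every interpretation of one binary relation P over {0..n-1}
    that satisfies all axioms and contains all given facts (the 'dataset')."""
    dom = range(n)
    pairs = [(x, y) for x in dom for y in dom]
    # Exhaustive search over all 2^(n*n) candidate interpretations of P.
    for bits in product([False, True], repeat=len(pairs)):
        P = {p for p, b in zip(pairs, bits) if b}
        if all(f in P for f in facts) and all(ax(P, dom) for ax in axioms):
            yield P

def reflexive(P, dom):
    return all((x, x) in P for x in dom)

def antisymmetric(P, dom):
    return all(x == y or not ((x, y) in P and (y, x) in P)
               for x in dom for y in dom)

def transitive(P, dom):
    return all((x, z) in P for (x, y) in P for (y2, z) in P if y == y2)

# Hypothetical toy theory: P behaves like a partial order.
TOY_THEORY = [reflexive, antisymmetric, transitive]

# Internal consistency: the axioms alone have some model.
internally_consistent = next(models(2, TOY_THEORY), None) is not None

# External consistency: some model also contains the dataset's facts.
consistent_with_data = next(
    models(3, TOY_THEORY, facts=[(0, 1), (1, 2)]), None) is not None

# A dataset that contradicts antisymmetry admits no model of this size.
inconsistent_with_data = next(
    models(2, TOY_THEORY, facts=[(0, 1), (1, 0)]), None) is None
```

Even in this toy setting the candidate space for one binary relation has 2^(n·n) members, which hints at why the number and arity of relations matter for scalability.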
Model finding for FOL ontologies typically utilizes FOL model finders, which are often lumped together with other automated theorem provers (ATP) but which focus on proving satisfiability rather than unsatisfiability of a logical theory. However, the available model finders, such as the SAT-based model finders Paradox (Claessen and Sörensson 2003) or Mace4 (McCune 2003), are mostly tested on relatively small axiomatizations with few nonlogical symbols (i.e., predicate and function symbols), as commonly found in mathematical conjectures. To search for models, SAT-based model finders convert a FOL theory into Clausal Normal Form (CNF) and then instantiate it with (an increasing number of) individuals to produce a series of propositional satisfiability (SAT) problems, whose size (as measured in number of propositional variables and clauses) grows exponentially with the number of individuals in a model and the size (number and arity of predicates) of the ontology’s terminology.

Objectives This paper examines to what extent model finding for FOL ontologies, with and without data, is feasible in practice. Our theoretical analysis reveals that a major source of intractability is the number and arity of terms. But unlike other mathematical theories, many terms are ex-