Using a SAT-based Model Finder to Verify First-Order Logic Ontologies of Space against Datasets: Scalability and Bottlenecks in Practice

Shirly Stephen and Torsten Hahmann
School of Computing and Information Science
University of Maine, Orono, ME 04469

Abstract

Semantic verification of an ontology encompasses checking its internal consistency, which ensures the ontology itself is free of contradictions, and its external consistency, which ensures that it is consistent with the kind of datasets it is intended to work with. Unlike description logic (DL) ontologies, first-order logic (FOL) ontologies are rarely verified externally against datasets because FOL model finders have been limited to tiny models with less than 20 individuals. Focusing on FOL ontologies of spatial relations, this paper investigates the source of the bottleneck via a formal analysis and experiments with the SAT-based model finder Paradox. We demonstrate that the presence of many defined terms of highest arity (here defined binary spatial relations), which are common in ontologies but not in other mathematical theories, significantly slows down model finding. It is shown that removing optional explicit definitions and substituting the terms they define with their definiens exponentially speeds up model finding, allowing easy construction of models with more than 100 individuals. While this is still small compared to what DL reasoners can handle, it is a significant improvement over the tiny, often trivial models to which verification of FOL ontologies has been traditionally restricted.

1 Introduction

Formal ontologies capture a domain's terminology and its semantics in a logic-based language as a means to automatically reason about the domain. Such ontologies may provide the background knowledge necessary to interpret a dataset collected in the domain, to semantically integrate different datasets or applications, or to make implicit assumptions in the domain explicitly provable.
However, an ontology can only serve its purpose if we know that it correctly and adequately captures the modeled domain. Ontology evaluation comprises techniques and measures for assessing an ontology's correctness and adequacy, wherein ontology verification aims to establish an ontology's correctness (Vrandečić 2009; Gómez-Pérez 2004) and ontology validation its adequacy (Obrst et al. 2007; Gómez-Pérez 2004). While ontology validation requires significant human intervention, many aspects of ontology verification benefit from extensive automation (Grüninger et al. 2010), including checking an ontology's logical consistency. This comes in two forms: (1) checking an ontology's internal consistency, which rules out contradictions by generating some model, and (2) checking its external consistency with datasets that are representative of the ontology's intended domain or application.

Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

While description logic (DL) ontologies, including OWL ontologies, can be efficiently verified internally and externally even with large datasets (i.e., a large ABox), first-order logic (FOL) ontologies are often only internally verified. One reason for this is that FOL ontologies often exclusively formalize the structure of a domain (i.e., the terms and how they can be interpreted) and rarely contain facts or data points about individuals (i.e., they typically lack an ABox). But more importantly, model finding for FOL ontologies is not only theoretically incomplete but has also not been very successful in practice, except for tiny, often trivial models. This paper examines this assumption more closely and identifies specific bottlenecks that can be remedied for model finding to scale better in practice despite its theoretical undecidability and intractability.
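The two forms of consistency checking described above can be illustrated with a minimal brute-force sketch. Everything in it is an assumption made for illustration: the toy theory (a single binary relation `P` axiomatized as a reflexive, antisymmetric, transitive parthood-like relation), the function names `models` and `consistent`, and the exhaustive enumeration, which is not how Paradox or any other model finder named in this paper works. Internal consistency corresponds to finding any model of the axioms; external consistency corresponds to finding a model that also contains the facts of a dataset.

```python
from itertools import product


def models(n, facts=frozenset()):
    """Yield every interpretation of a binary relation P over {0, ..., n-1}
    that satisfies the toy axioms (hypothetical, for illustration only)
    and contains all of the given dataset facts."""
    dom = range(n)
    pairs = [(x, y) for x in dom for y in dom]
    # Enumerate all 2^(n^2) candidate extensions of P.
    for bits in product([False, True], repeat=len(pairs)):
        P = {p for p, b in zip(pairs, bits) if b}
        if not facts <= P:
            continue  # candidate disagrees with the dataset
        reflexive = all((x, x) in P for x in dom)
        antisymmetric = all(
            x == y or not ((x, y) in P and (y, x) in P)
            for x in dom for y in dom
        )
        transitive = all(
            (x, z) in P
            for (x, y) in P for (y2, z) in P if y == y2
        )
        if reflexive and antisymmetric and transitive:
            yield P


def consistent(n, facts=frozenset()):
    """True if the toy theory plus the facts has a model of size n."""
    return next(models(n, facts), None) is not None


# Internal consistency: the axioms alone have a model with 2 individuals.
print(consistent(2))
# External consistency: a dataset that respects antisymmetry is accepted ...
print(consistent(2, frozenset({(0, 1)})))
# ... while one that violates it is rejected.
print(consistent(2, frozenset({(0, 1), (1, 0)})))
```

Because the sketch enumerates all 2^(n^2) candidate interpretations of a single binary relation, it is only usable for a handful of individuals, which gives a rough intuition for why naive model finding stays confined to tiny models.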
Model finding for FOL ontologies typically relies on FOL model finders, which are often lumped together with other automated theorem provers (ATPs) but which focus on proving satisfiability rather than unsatisfiability of a logical theory. However, the available model finders, such as the SAT-based model finders Paradox (Claessen and Sörensson 2003) or Mace4 (McCune 2003), are mostly tested on relatively small axiomatizations with few nonlogical symbols (i.e., predicate and function symbols), as commonly found in mathematical conjectures. To search for models, SAT-based model finders convert a FOL theory into Clausal Normal Form (CNF) and then instantiate it with an increasing number of individuals to produce a series of propositional satisfiability (SAT) problems, whose size (as measured in the number of propositional variables and clauses) grows exponentially with the number of individuals in a model and the size (number and arity of predicates) of the ontology's terminology.

Objectives. This paper examines to what extent model finding for FOL ontologies, with and without data, is feasible in practice. Our theoretical analysis reveals that a major source of intractability is the number and arity of terms. But unlike other mathematical theories, many terms are ex-