GMU C4I Center Technical Report C4I-06-01 ©2004 Kathryn Blackmond Laskey 2/3/06

MEBN: A Logic for Open-World Probabilistic Reasoning

Kathryn Blackmond Laskey KLASKEY@GMU.EDU
Department of Systems Engineering and Operations Research
MS4A6
George Mason University
Fairfax, VA 22030, USA

Abstract

Uncertainty is a fundamental and irreducible aspect of our knowledge about the world. Probability is the most well-understood and widely applied logic for computational scientific reasoning under uncertainty. As theory and practice advance, general-purpose languages are beginning to emerge for which the fundamental logical basis is probability. However, such languages have lacked a logical foundation that fully integrates classical first-order logic with probability theory. This paper presents such an integrated logical foundation. A formal specification is presented for multi-entity Bayesian networks (MEBN), a knowledge representation language based on directed graphical probability models. A proof is given that a probability distribution over interpretations of any consistent, finitely axiomatizable first-order theory can be defined using MEBN. A semantics based on random variables provides a logically coherent foundation for open world reasoning and a means of analyzing tradeoffs between accuracy and computation cost. Furthermore, the underlying Bayesian logic is inherently open, having the ability to absorb new facts about the world, incorporate them into existing theories, and/or modify theories in the light of evidence. Bayesian inference provides both a proof theory for combining prior knowledge with observations, and a learning theory for refining a representation as evidence accrues. The results of this paper provide a logical foundation for the rapidly evolving literature on first-order Bayesian knowledge representation, and point the way toward Bayesian languages suitable for general-purpose knowledge representation and computing.
Because first-order Bayesian logic contains classical first-order logic as a deterministic subset, it is a natural candidate as a universal representation for integrating domain ontologies expressed in languages based on classical first-order logic or subsets thereof.

Keywords: Bayesian networks, Bayesian learning, graphical probability models, knowledge representation, multi-entity Bayesian network, random variable, probabilistic ontology

1 Introduction

First-order logic is primary among logical systems from both a theoretical and a practical standpoint. It has been proposed as a unifying logical foundation for defining extended logics and interchanging knowledge among applications written in different languages. However, its applicability has been limited by the lack of a coherent semantics for plausible reasoning. A theory in first-order logic assigns definite truth-values only to sentences that have the same truth-value (either true or false) in all interpretations of the theory. The most that can be said about any other sentence is that its truth-value is indeterminate. A reasoner that requires logical proof before it can draw conclusions is inadequate for many practical applications. This problem has been addressed with a proliferation of plausible reasoning logics, but these have lacked firm theoretical grounding.

The need for plausible reasoning is especially acute for the problem of knowledge interchange. Different applications have different ontologies, different semantics, and different knowledge and data stores. Legacy applications are usually only partially documented, and may rely on tacit usage conventions that even proficient users do not fully understand or appreciate. Even if these problems could be circumvented and a full formal specification for each application