A nonmonotonic logical approach for modelling and revising metabolic networks

Oliver Ray
Dept. Computer Science, University of Bristol, Bristol, BS8 1UB, UK
oray@cs.bris.ac.uk

Ken Whelan
Dept. Computer Science, University of Aberystwyth, Ceredigion, SY23 3DB, UK
knw@aber.ac.uk

Ross King
Dept. Computer Science, University of Aberystwyth, Ceredigion, SY23 3DB, UK
rdk@aber.ac.uk

Abstract—This paper describes a new logic-based approach for representing and reasoning about metabolic networks. First it shows how biological pathways can be elegantly represented in a logic programming formalism able to model full chemical reactions with substrates and products in different cell compartments, and which are catalysed by iso-enzymes or enzyme-complexes that are subject to inhibitory feedbacks. Then it shows how a nonmonotonic reasoning system called XHAIL can be used as a practical method for learning and revising such metabolic networks from observational data. Preliminary results are described in which the approach is validated on a state-of-the-art model of Aromatic Amino Acid biosynthesis.

I. INTRODUCTION

Logic programs are a useful formalism for representing and reasoning about metabolic networks. They offer an intuitive relational language for modelling biological knowledge and they are supported by powerful tools for automating various types of inference like deduction (consequence) [6], abduction (explanation) [3], and induction (generalisation) [9]. Moreover, through the “negation as failure” operator, they also provide, as we will show, a practical nonmonotonic reasoning mechanism for revising biological knowledge in the light of new data.

This paper proposes a qualitative approach for representing and reasoning about metabolic networks.
First it shows how biological pathways can be elegantly represented in a logic programming formalism that is able to model full chemical reactions, whose substrates and products are in different cell compartments, and which are catalysed by iso-enzymes or enzyme-complexes that are subject to inhibitory feedbacks [5]. Then it shows how a nonmonotonic reasoning system called XHAIL (eXtended Hybrid Abductive Inductive Learning) [13] can be used as a practical method for learning and revising such metabolic networks from observational data. Preliminary results are described in which the approach is validated on a realistic model of Aromatic Amino Acid (AAA) biosynthesis in the yeast S. cerevisiae [4]. This initial work shows how XHAIL can simultaneously add or remove metabolic reactions, enzyme inhibitions, and gene-enzyme mappings to or from a metabolic network, in order to better explain the results of observed growth experiments.

II. BACKGROUND

A. Logic Programs

This paper assumes a familiarity with logic programs [6]. An atom a is a predicate p applied to a collection of terms (t_1, ..., t_n). A literal l is an atom a (positive literal) or the negation of an atom not a (negative literal). A clause C is an expression of the form a :- l_1, ..., l_m where a is an atom (head atom) and the l_i are literals (body literals). Informally, a clause is read as an implication which states that the head atom must be true if all of the body literals are true. A clause with no body literals is called a fact and a clause with no head atom is called a constraint. A logic program P is a set of clauses. Informally, a program is read as a conjunction which states that all of its clauses are true.

The meaning of a logic program is formalised by the so-called Answer Set semantics [2]. A program P entails a set of ground literals L, denoted P |= L, if there is an answer set of P that satisfies all of the literals l in L.
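The definitions of clauses, facts, constraints, and entailment can be made concrete with a small sketch. The Python code below is a brute-force illustration only, not the solver used in this work: it enumerates candidate sets of atoms, applies the standard reduct construction, and keeps those sets that are answer sets. The clause encoding as (head, positive_body, negative_body) triples, the function names, and the toy program over the atoms p, q, and r are all assumptions made for illustration.

```python
from itertools import combinations

# Assumed encoding: a ground clause is (head, positive_body, negative_body).
# head is None for a constraint; both bodies are empty for a fact.

def least_model(definite_clauses):
    """Least model of negation-free clauses given as (head, positive_body)."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_clauses:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model

def answer_sets(program):
    """Enumerate answer sets by brute force (exponential; toy programs only)."""
    atoms = sorted({a for h, pos, neg in program
                    for a in ([h] if h is not None else []) + list(pos) + list(neg)})
    found = []
    for size in range(len(atoms) + 1):
        for candidate in combinations(atoms, size):
            s = set(candidate)
            # Reduct: drop clauses whose negative body is contradicted by s,
            # then drop the remaining negative literals.
            reduct = [(h, pos) for h, pos, neg in program
                      if h is not None and not (set(neg) & s)]
            if least_model(reduct) != s:
                continue
            # A constraint is violated when its whole body holds in s.
            violated = any(h is None and set(pos) <= s and not (set(neg) & s)
                           for h, pos, neg in program)
            if not violated:
                found.append(s)
    return found

def entails(program, literals):
    """P |= L: some answer set satisfies every literal (atom, is_positive) in L."""
    return any(all((atom in s) == positive for atom, positive in literals)
               for s in answer_sets(program))

# Hypothetical toy program:
#   p :- not q.      q :- not p.      :- q, r.      r.
program = [
    ("p", [], ["q"]),
    ("q", [], ["p"]),
    (None, ["q", "r"], []),
    ("r", [], []),
]

print([sorted(s) for s in answer_sets(program)])   # [['p', 'r']]
print(entails(program, [("p", True), ("q", False)]))  # True
```

Without the constraint, the first two clauses would admit two answer sets, one containing p and one containing q; the constraint together with the fact r eliminates the second. Practical systems rely on dedicated answer set solvers rather than enumeration of this kind.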
The Answer Set semantics embodies a form of closed world assumption whereby a negative literal not a is entailed by P if the corresponding positive literal a is not entailed by P. This means |= is a nonmonotonic relation in the sense that adding facts to the left hand side may result in the retraction of facts from the right hand side.

A logic program H (hypothesis) entails a set of ground literals E (examples) with respect to a logic program B (background theory) if B ∪ H |= E. The theory H is called an abductive (resp. inductive) hypothesis if it is a set of ground facts (resp. non-ground rules).

B. The XHAIL System

XHAIL [13] is a nonmonotonic learning system that takes as input a logic program B (background theory) along with a set of ground literals E (examples), and returns as output a logic program H (hypothesis) such that B ∪ H |= E. The hypothesis space is controlled by a set of mode declarations [7] that allow the user to specify which literals may appear in the heads and bodies of hypothesis clauses. A compression heuristic [7] is used to select between competing hypotheses by preferring solutions with fewer literals. Each hypothesis H is obtained by constructing and generalising a preliminary ground theory K, called a Kernel Set, which bounds the search space according to the given bias.
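The requirement B ∪ H |= E and the preference for hypotheses with fewer literals can be illustrated with a minimal Python sketch restricted to the abductive case, where H is a set of ground facts. This is not XHAIL's algorithm: it ignores mode declarations and Kernel Sets and simply tries candidate fact sets in order of increasing size. The background theory, the atom names (grows, blocked, inhibited, bypass), and the list of abducible atoms are hypothetical, and constraints are omitted for brevity.

```python
from itertools import combinations

def answer_sets(program):
    """Brute-force answer sets of ground (head, pos_body, neg_body) clauses."""
    atoms = sorted({a for h, pos, neg in program for a in [h, *pos, *neg]})
    found = []
    for size in range(len(atoms) + 1):
        for candidate in combinations(atoms, size):
            s = set(candidate)
            # Reduct: keep clauses whose negative body is not contradicted by s.
            reduct = [(h, pos) for h, pos, neg in program if not (set(neg) & s)]
            model, changed = set(), True
            while changed:  # least model of the negation-free reduct
                changed = False
                for h, pos in reduct:
                    if h not in model and set(pos) <= model:
                        model.add(h)
                        changed = True
            if model == s:
                found.append(s)
    return found

def entails(program, literals):
    """P |= L: some answer set satisfies every literal (atom, is_positive)."""
    return any(all((atom in s) == positive for atom, positive in literals)
               for s in answer_sets(program))

def smallest_hypothesis(background, examples, abducibles):
    """First (hence smallest) set of ground facts H with B ∪ H |= E, else None."""
    for size in range(len(abducibles) + 1):
        for facts in combinations(abducibles, size):
            hypothesis = [(f, [], []) for f in facts]
            if entails(background + hypothesis, examples):
                return list(facts)
    return None

# Hypothetical background theory B:
#   grows :- not blocked.
#   blocked :- inhibited, not bypass.
background = [
    ("grows", [], ["blocked"]),
    ("blocked", ["inhibited"], ["bypass"]),
]
observation = [("grows", False)]      # E: no growth was observed
abducibles = ["inhibited", "bypass"]  # assumed candidate ground facts

print(smallest_hypothesis(background, observation, abducibles))  # ['inhibited']
```

The same toy theory also shows the nonmonotonic behaviour of |=: B alone entails grows, whereas B extended with the fact inhibited no longer does. Adding the further fact bypass restores the conclusion.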