Inference of analytical thermodynamic models for biological networks Eliodoro Chiavazzo, 1, Matteo Fasano, 1 and Pietro Asinari 1 1 Energy Department, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy (Dated: November 13, 2012) We present an automated algorithm for inferring analytical models of closed reactive biochemical mixtures, on the basis of standard approaches borrowed from thermodynamics and kinetic theory of gases. As an input, the method requires a number of steady states (i.e. an equilibria cloud in the phase-space), and at least one time series of measurements for each species. Validations are discussed for both the Michaelis-Menten mechanism (four species, two conservation laws) and the mitogen-activated protein kinase - MAPK - mechanism (eleven species, three conservation laws). PACS numbers: 05.20.Dd, 82.39.-k, 82.20.Wt INTRODUCTION The reverse engineering of biological networks from experimental observations has recently gained an increasing attention, owing to remarkable advancements in modern high-throughput techniques for the generation of time series data on metabolites, genes and other components of biological relevance [1]. However, due to high dimensionality, the latter still remains a demanding task that often requires a priori knowledge on the system structure. Predictive mathematical models are highly desirable, for instance, for the external control of cellular functions, and this has motivated an intense effort in such a direction [2, 3]. In this work, we intend to investigate the ability of some classical thermodynamic approaches for the automatic prediction of equilibria, dynamical behavior and system structure. The main advantage of such an approach is that it does not require prior knowledge on the underlying biochemical mechanism, and it is solely based on measurements of species concentrations in closed systems. We focus on biological systems formed by several species interacting according to a web of (bio)chemical reactions in closed systems under fixed temperature T and volume V . We further assume that dissipation is ensured by the existence of a global Lyapunov function G, which is typically linked to a thermodynamic potential, and a unique steady state (equilibrium) is reached after a sufficiently long time. Let the concentration of n species evolve in time according to an autonomous system of ordinary differential equations (ODEs): ˙ x = dx dt = f (x) , (1) with x =[x 1 , ..., x n ] T defining the system state (e.g. in terms of molar concentrations x i ). Let x eq and G(x) be the unique equilibrium state of the ODEs (1) and its global Lyapunov function, respectively. Hence, at all instant t, the time derivative of G is non-positive, ˙ G = Gf 0, and it vanishes at steady state: ˙ G (T,V,x eq ) = 0. Time dynamics (1) is often characterized by linear constraints (e.g. due to conservation of the mole number of elements forming the chemical species). Thus, assuming the presence of r conserved quantities, there exists a fixed (r × n) matrix M such that, at all time instants t: Mx (t)= C, (2) with C being an rcomponent column of fixed quantities (conserved moieties). SEARCHING FOR CONSERVATION LAWS Neither the number nor the expressions of the conservation laws (2) are typically known when investigating on a new biological phenomenon, unless pre-existing knowledge on the reaction stoichiometry is available. For addressing the above issues, the suggested approach is based upon the analysis of a collection of scattered steady states (experimental equilibrium cloud), and at least one time series of species concentrations evolving from an arbitrary initial state. In this work, we perform inspection of the equilibrium cloud by means of principal component analysis - PCA - [4] in order to estimate the cloud dimension which, as discussed below, indicates the number of conservation laws. For the sake of completeness, it is worth stressing that more recent non-linear techniques, such as diffusion maps [5], may be also adopted for estimating the dimension r.