Parameter Estimation with Term-wise Decomposition in Biochemical Network GMA Models by Hybrid Regularized Least Squares-Particle Swarm Optimization

Prospero C. Naval, Jr., Luis G. Sison, and Eduardo R. Mendoza

Abstract— High-throughput analytical techniques such as nuclear magnetic resonance, protein kinase phosphorylation, and mass spectroscopic methods generate time-dense profiles of metabolites or proteins that are replete with structural and kinetic information about the underlying system that produced them. Experimentalists are in urgent need of computational tools that allow efficient extraction of this information from these time series data. We describe a new parameter estimation method for biochemical systems formulated as Generalized Mass Action (GMA) models, which are known to capture the nonlinear dynamics of complex biological systems such as gene regulatory, signal transduction, and metabolic networks. For such models, it is known that parameter estimation algorithm performance deteriorates rapidly with increasing network size. We propose a decomposition strategy that breaks up the system equations into terms whose rate constants and kinetic order parameters are estimated one term at a time, resulting in dramatic reductions in parameter space dimensionality. This approach is demonstrated in a hybrid algorithm based on Regularized Least Squares Regression and Multi-objective Particle Swarm Optimization. We validate our proposed strategy through the efficient and accurate extraction of GMA model parameter values from noise-free and noisy simulated data for Saccharomyces cerevisiae and actual Nuclear Magnetic Resonance (NMR) data for Lactococcus lactis.

I. INTRODUCTION

Biochemical systems parameter estimation has risen to prominence in recent years due to its fundamental role in biological network reconstruction and modeling.
The challenge of extracting the parameter values of a nonlinear biochemical model from data becomes even more pressing as high-throughput methods begin to deliver on their promise as high-resolution tools for biological experimentation.

Prospero C. Naval, Jr. is with the Department of Computer Science, University of the Philippines, Diliman, Quezon City, Philippines (email: pcnaval@dcs.upd.edu.ph).
Luis G. Sison is with the Electrical and Electronics Engineering Institute, University of the Philippines, Diliman, Quezon City, Philippines (email: sison@eee.upd.edu.ph).
Eduardo R. Mendoza is with the Physics Department and Center for NanoScience, Ludwig Maximilians University, Munich, Germany (email: mendoza@lmu.de).

Biochemical Systems Theory (BST) has been advanced as a convenient mathematical framework for modeling, analysis, optimization and manipulation of complex biological systems. BST views a biochemical system as a set of processes representable as products of power laws in their inputs, whose dynamics can account for all observed biochemical responses. The differential equations describing these processes have two formats in BST: the Generalized Mass Action (GMA) and S-System formulations. GMA systems, which include S-systems as special cases, have parameters that map one-to-one onto the network's topological and regulatory features [34].

A GMA system is described by the following set of coupled differential equations:

\[
\dot{X}_i = \sum_{m=1}^{k} \pm \gamma_{im} \prod_{j \in r_{im}} X_j(t)^{f_{ijm}}, \qquad t \in [t_0, t_N], \quad i = 1, \ldots, n
\]

where the positive rate constants $\gamma_{i1}, \ldots, \gamma_{im}, \ldots, \gamma_{ik}$ quantify the magnitudes of the fluxes of the $k$ production/consumption reactions, and $f_{ij1}, \ldots, f_{ijm}, \ldots, f_{ijk}$ are the kinetic orders describing the inhibitory/activating influence of species $j$ on species $i$ in reaction $m$. The sets $r_{i1}, \ldots, r_{im}, \ldots, r_{ik}$ contain the indices of the reacting species, with $r_{im}$ listing the species involved in reaction $m$.

With appropriate algorithms (e.g.
[12], [31], [32]), the parameter values of a biochemical network may be extracted from time course measurements, which can be considered as perturbations from some mean state of the network.

For certain problems such as metabolic network modeling, the parameter estimation task is simplified because the complete interaction structure is known a priori. On the other hand, the reconstruction of gene regulatory networks and signal transduction networks is often a network inference problem in which the interaction structure has to be determined in concert with parameter estimation.

A variety of BST parameter estimation methods have been proposed in recent years, with the majority of authors preferring stochastic search over deterministic techniques. Stochastic search approaches are highly robust but sacrifice execution speed, whereas the faster deterministic methods frequently have great difficulty arriving at suitable solutions for systems with a large number of variables [21]. Stochastic search methods include approaches originally devised for discrete-valued problems, such as Genetic Algorithms [33], [15], [14], [31], Genetic Programming [5], [16], Memetic Algorithms [30, ], and Co-evolutionary Algorithms [17], and approaches devised for continuous-valued parameter spaces, such as Particle Swarm Optimization [31] and Simulated Annealing [12]. Parameter estimation has been achieved with varying levels of success using the following deterministic methods: Nelder-Mead [29], Jacobian Linearization [18], Regression and extensions [20], [6], [36], Branch and Bound [25], Newton-Flow [19], and Constraint Propagation [32].

WCCI 2010 IEEE World Congress on Computational Intelligence, July 18-23, 2010, CCIB, Barcelona, Spain. CEC IEEE 978-1-4244-8126-2/10/$26.00 © 2010 IEEE
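To make the GMA form given in the Introduction concrete, the following minimal Python sketch evaluates the right-hand side of a GMA system and integrates it forward in time. It illustrates the model structure only and is not the estimation algorithm proposed in this paper. The function names `gma_rhs` and `rk4_simulate`, the data layout (one list of signed power-law terms per species), the fixed-step fourth-order Runge-Kutta integrator, and the two-species toy network `TOY_TERMS` with its parameter values are all assumptions introduced for illustration.

```python
def gma_rhs(x, terms):
    """Evaluate dX_i/dt for every species of a GMA system.

    terms[i] lists the power-law terms of species i. Each term is a tuple
    (sign, gamma, orders): sign is +1 (production) or -1 (consumption),
    gamma is the positive rate constant, and orders maps a species index j
    to its kinetic order f_ijm. The keys of orders play the role of r_im.
    """
    dx = []
    for species_terms in terms:
        total = 0.0
        for sign, gamma, orders in species_terms:
            flux = gamma
            for j, f in orders.items():
                flux *= x[j] ** f  # product of power laws X_j^f_ijm
            total += sign * flux
        dx.append(total)
    return dx


def rk4_simulate(x0, terms, t0, t_end, steps):
    """Integrate the (autonomous) GMA system with fixed-step classical RK4."""
    h = (t_end - t0) / steps
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(steps):
        k1 = gma_rhs(x, terms)
        k2 = gma_rhs([xi + 0.5 * h * ki for xi, ki in zip(x, k1)], terms)
        k3 = gma_rhs([xi + 0.5 * h * ki for xi, ki in zip(x, k2)], terms)
        k4 = gma_rhs([xi + h * ki for xi, ki in zip(x, k3)], terms)
        x = [
            xi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
        ]
        trajectory.append(list(x))
    return trajectory


# Hypothetical two-species network (values chosen for illustration only):
#   dX0/dt = 2 - 1.0 * X0^0.5
#   dX1/dt = 1.0 * X0^0.5 - 0.5 * X1^1.0
TOY_TERMS = [
    [(+1, 2.0, {}), (-1, 1.0, {0: 0.5})],
    [(+1, 1.0, {0: 0.5}), (-1, 0.5, {1: 1.0})],
]

if __name__ == "__main__":
    trajectory = rk4_simulate([1.0, 1.0], TOY_TERMS, 0.0, 100.0, 2000)
    print(trajectory[-1])  # final concentrations of X0 and X1
```

In this layout each species carries its own list of terms, so a single term (one rate constant plus its kinetic orders) can be isolated and examined on its own, which is the granularity at which the term-wise decomposition described in the abstract operates. In practice an adaptive ODE solver would likely replace the fixed-step integrator used here.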