Investigating the genomic basis of metabolic robustness through in silico flux analysis

Marcin Imielinski, Niels Klitgord, Calin Belta

Abstract: We employ a novel implementation of flux balance analysis to investigate the role of genome structure in the maintenance of metabolic robustness. We propose the hypothesis that the genomic organization of a bacterium buffers its metabolome against random gene deletion. To test this hypothesis, we use a novel implementation of producibility analysis to determine the metabolomic impact of gene deletions in the E. coli iJR904 genome-scale metabolic model. From these results, we determine metabolomic fragility, which we compute as the average number of metabolites knocked out across all gene deletions of a given size in a given nutrient medium. We apply this analysis for three deletion window sizes (4000, 8000, and 16000 bp) across the length of the E. coli genome. We compare these results to those obtained from several null distributions of permuted genomes to assess the impact of E. coli genome organization on its metabolic robustness. Our results strongly suggest that the arrangement of genes on the E. coli genome buffers metabolite producibility against random gene deletion. These results have interesting implications for the understanding of metabolic network evolution. Future work includes examining our hypothesis for a wider range of deletion sizes and nutrient environments and extending our results to the metabolic networks of other species.

I. INTRODUCTION

The metabolic network is the biochemical machinery with which a cell transforms a limited set of nutrients in its environment into the multitude of molecules required for growth and survival. It consists of hundreds to thousands of small-molecule species intricately linked by an even larger set of biochemical reactions.
The expansive and highly connected nature of this important cellular system greatly limits the insight that can be gained from the isolated study of a single component or module. The first step towards a systems-level understanding of metabolism is the construction of a model that captures what is known about an organism's small-molecule biochemistry and its underlying genetics. The advent of sequencing technology, combined with general improvements in the organization of biological information [7], [10], has allowed the building of such genome-scale metabolic models for numerous microbial organisms, including E. coli, S. cerevisiae, H. pylori, and S. aureus [13], [14], [4], [5], [11], [12], [8].

M. Imielinski is with the University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA imielins@mail.med.upenn.edu
N. Klitgord and C. Belta are with the Boston University Graduate Program in Bioinformatics, Boston, MA 02115, USA {niels,cbelta}@bu.edu

Genome-scale metabolic modeling enables the in silico study of the relationships between biological components and systems-level functions. It also allows for the examination of global features of biological systems that may not be evident through the study of isolated genes or pathways. One such systems-level feature is robustness, which represents a biological system's ability to function in a wide range of environments and in the context of component failure. One particularly important aspect of metabolic network robustness is its ability to buffer essential functions of the organism against random gene deletion.

Flux balance analysis provides a powerful tool to examine metabolic network robustness at the genome scale [10]. A variant of flux balance analysis, called producibility analysis, employs linear programming to identify the metabolite knockouts that are predicted to result from a gene knockout, given the genome-scale model and a nutrient medium [6].
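The linear-programming test behind producibility analysis can be sketched on a toy network. This is a minimal illustration under stated assumptions, not the implementation used in this work: the three-metabolite network, the flux bound `V_MAX`, the function names, and the specific formulation (maximize the net production of one metabolite subject to non-negative net production of all metabolites) are all hypothetical choices made for the example. Gene deletions are represented directly as lists of disabled reaction indices; a real model would map genes to reactions first.

```python
import numpy as np
from scipy.optimize import linprog

# Toy network (hypothetical). Rows are metabolites, columns are reactions:
#   R0: uptake -> A    R1: A -> B    R2: A -> B (parallel route)    R3: B -> C
METABOLITES = ["A", "B", "C"]
S = np.array([
    [1, -1, -1,  0],   # A
    [0,  1,  1, -1],   # B
    [0,  0,  0,  1],   # C
], dtype=float)
V_MAX = 10.0  # assumed upper bound on every flux


def producible(i, knocked_out=(), tol=1e-6):
    """Return True if metabolite i can have strictly positive net production.

    Assumed formulation: maximize (S v)_i subject to S v >= 0 and
    0 <= v <= V_MAX, with knocked-out reactions forced to zero flux.
    """
    n_rxn = S.shape[1]
    bounds = [(0.0, 0.0) if j in knocked_out else (0.0, V_MAX)
              for j in range(n_rxn)]
    # linprog minimizes, so negate the objective; S v >= 0 becomes -S v <= 0.
    res = linprog(c=-S[i], A_ub=-S, b_ub=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    return res.status == 0 and -res.fun > tol


def metabolomic_impact(knocked_out):
    """Metabolites producible in the intact network but not after the knockout."""
    return {name for i, name in enumerate(METABOLITES)
            if producible(i) and not producible(i, knocked_out)}


if __name__ == "__main__":
    for deletion in ([1], [3], [1, 2]):
        print(deletion, sorted(metabolomic_impact(deletion)))
```

In this toy network, disabling only one of the two parallel A -> B reactions leaves every metabolite producible, whereas disabling both removes B and C.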
This set of metabolite knockouts resulting from a gene deletion provides a global measure of that gene deletion's effect on network function, which we term the metabolomic impact. Producibility analysis in E. coli shows its biosynthetic function to be highly robust to single-gene deletion in rich media. In other words, most single-gene deletions in this strain and nutrient medium have no metabolomic impact [6]. This robustness is thought to arise at three levels: gene, protein, and pathway. Robustness at the gene level is attributed to gene duplication. Robustness at the protein level results from multiple enzymes performing identical functions. Pathway-based robustness occurs when multiple pathways in the metabolic network achieve the same objective.

In this study, we propose a new layer of mechanisms underlying E. coli metabolic robustness at the genome scale. Namely, we postulate that the position of genes in the genome has evolved to buffer the organism against random deletions. To test this hypothesis, we apply a novel and efficient implementation of producibility analysis to evaluate the biosynthetic robustness of the E. coli metabolic network to random genomic deletion. By comparing these results to those obtained from "permuted genomes", we demonstrate that the position of genes in E. coli significantly protects metabolites against gene deletion. This result has interesting implications for the understanding of metabolic network evolution.

II. METHODS

A. Genome scale metabolic models

Notation. For $n, i \in \mathbb{N}$, we use $I_n$ to denote the $n \times n$ identity matrix, and $e_{n,i} \in \mathbb{R}^n$ to denote the $i$-th element of the Euclidean basis in $\mathbb{R}^n$. Given $m, n \in \mathbb{N}$, we use the notation $M = \{1, \ldots, m\}$ and $N = \{1, \ldots, n\}$. For a set $C$, we use $|C|$ to denote its cardinality. If $A \in \mathbb{R}^{m \times n}$ and

Proceedings of the 47th IEEE Conference on Decision and Control, Cancun, Mexico, Dec. 9-11, 2008, TuB05.6. 978-1-4244-3124-3/08/$25.00 ©2008 IEEE