Multilevel modeling for inference of genetic regulatory networks

Shu-Kay Ng (a), Kui Wang (b) and Geoffrey J. McLachlan (a,b,c)

(a) Department of Mathematics, University of Queensland, Brisbane, QLD 4072, Australia;
(b) ARC Centre for Complex Systems, University of Queensland, Brisbane, QLD 4072, Australia;
(c) Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD 4072, Australia

ABSTRACT

Time-course experiments with microarrays are often used to study dynamic biological systems and genetic regulatory networks (GRNs), which model how genes influence each other in the cell-level development of organisms. Inference of GRNs provides important insights into fundamental biological processes such as growth, and is useful in disease diagnosis and genomic drug design. Because of the experimental design, multilevel data hierarchies are often present in time-course gene expression data. Most existing methods, however, ignore the dependency of the expression measurements over time and the correlation among gene expression profiles. Such independence assumptions are at odds with the presence of regulatory interactions; they can cause important subject effects to be overlooked and lead to spurious inference about regulatory networks or mechanisms. In this paper, a multilevel mixed-effects model is adopted to incorporate data hierarchies in the analysis of time-course data, where temporal and subject effects are both assumed to be random. The method starts by clustering the genes: a mixture model is fitted within the multilevel random-effects model framework using the expectation-maximization (EM) algorithm. The network of regulatory interactions is then determined by searching for regulatory control elements (activators and inhibitors) shared by the clusters of co-expressed genes, based on a time-lagged correlation coefficient measure. The method is applied to two real time-course datasets from the budding yeast (Saccharomyces cerevisiae) genome.
It is shown that the proposed method provides clusters of cell-cycle regulated genes that are supported by existing gene function annotations, and hence enables inference on regulatory interactions for the genetic network.

Keywords: EM algorithm, Mixture models, Multilevel mixed-effects model, Genetic regulatory networks, Time-course data

1. INTRODUCTION

In recent times, there has been an explosion in the development of comprehensive, high-throughput methods for molecular biology experimentation. The advent of DNA microarray technologies, such as complementary DNA (cDNA) arrays and oligonucleotide arrays, provides a means of measuring the expression of tens of thousands of genes simultaneously under different conditions. The complete description of the human genome, and of the genomes of other organisms, has been a major achievement of modern science. Microarray technologies promise to revolutionize our approaches to biomedical research and to further our understanding of biological processes and the evaluation of gene expression patterns, regulation, and interactions. [1]

The study of the dynamics of gene interactions is amongst the latest research directions in the post-genomic era in biology. It is well known that the information encoded in DNA leads to the expression of certain phenotypes, or characteristics, in the organism. [2] The determination of genetic regulatory networks (GRNs) provides useful information on how genes influence each other in the cell-level development of organisms. It can therefore provide important insights into fundamental biological processes such as growth, and is useful in disease diagnosis and genomic drug design. [2, 3] Time-course experiments with microarrays are used to measure gene expression at several points in time as a serial study.

Further author information: (Send correspondence to S.K. Ng)
S.K. Ng: E-mail: skn@maths.uq.edu.au, Telephone: +617 3365 6139
K. Wang: E-mail: kwang@maths.uq.edu.au, Telephone: +617 3346 2623
G.J. McLachlan: E-mail: gjm@maths.uq.edu.au, Telephone: +617 3365 2150

Complex Systems, edited by Axel Bender, Proc. of SPIE Vol. 6039, 60390S, (2006) · 0277-786X/06/$15 · doi: 10.1117/12.638449