Information Theoretic Approaches for Detecting Causality in Gene Regulatory Networks

Kangkan Medhi
Dept of Computer Sc. & Engg., Indian School of Mines, Dhanbad, India
kangkanit@gmail.com

Syed Sazzad Ahmed
Dept of Information Technology, North-Eastern Hill University, Shillong, India
sazzad.tezu@gmail.com

Swarup Roy
Dept of Information Technology, North-Eastern Hill University, Shillong, India
swarup@nehu.ac.in

Dhruba K Bhattacharyya
Dept of Computer Sc. & Engg., Tezpur University, Tezpur, India
dkb@tezu.ernet.in

Jugal K Kalita
Dept of Computer Sc., University of Colorado, Colorado Springs, USA
jkalita@uccs.edu

ABSTRACT

Causality detection in gene regulatory networks (GRN) is a challenging and important task. Very few techniques have been proposed so far to infer causality in GRN. A majority of them adopt information theory as a measure to infer a causal relationship. In this work we evaluate the performance of information theoretic causality detection techniques in GRN.

We consider two such measures, namely Transfer Entropy and Interaction Information, and compare their performance with Granger causality, a statistical causality inference method. For evaluation, we use synthetic gold standard data and the underlying causal networks from the DREAM challenges. Experimental results reveal that Interaction Information performs better than the other candidate methods for inferring causality in GRN. The results also show that the performance of information theoretic approaches is sensitive to the discretization method used.

CCS Concepts
•Applied computing → Computational biology; Recognition of genes and regulatory elements; Biological networks;

Keywords
Gene Regulatory Network; Causality; Inference; Mutual Information; Transfer Entropy

ICTCS '16, March 04-05, 2016, Udaipur, India. © 2016 ACM. ISBN 978-1-4503-3962-9/16/03. DOI: http://dx.doi.org/10.1145/2905055.2905188

1. INTRODUCTION

The genome encodes thousands of genes whose products enable numerous cellular functions. The amounts and the temporal patterns in which these products appear in the cell are crucial to the processes of life. Gene regulatory networks govern the levels of these gene products and thus play a very important role in every process of life, including cell differentiation, metabolism, the cell cycle and signal transduction. In a Gene Regulatory Network (GRN), a group of genes in a cell interact with each other and with other substances in the cell, such as proteins and metabolites, thereby regulating the rates at which genes in the network are transcribed into mRNA [7].

Computationally, a GRN is quite often represented as a graph that describes the interactions among biomolecules. A node in the graph represents a biomolecule such as a gene, a protein or a metabolite, and an edge (or link) indicates an interaction between two biomolecules. Such interactions may be physical interactions, metabolite flow, regulatory relationships, co-expression relationships, etc. [11]. Depending upon the nature of the edges, we get two types of networks: directed networks, which carry causal relationships, and undirected networks (also called gene co-expression networks).
In contrast to gene co-expression networks, a GRN is a directed representation that provides additional information, namely the direction of influence between two genes. In a GRN, causal information is an important component in deducing the existing regulatory relationship between genes or gene products. Causation describes the relationship between a cause and its effect, where the latter is an outcome of the former. Causation plays an important role in GRN and is depicted by a directed graph whose directed edges correspond to causal influences between gene activities (nodes).

A number of techniques have been proposed to infer causality in general for various application domains. They are either statistical or information theoretic approaches. These methods have been applied to find causality in GRN, and the majority of them are based on information theory.

In this work, we consider the transfer entropy and interaction information measures for inferring causality and compare their performance with the Granger causality measure.
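
To make the idea of a directed, information theoretic causality measure concrete, the following is a minimal sketch of a plug-in transfer entropy estimate between two expression time series. It is an illustration under stated assumptions, not the implementation evaluated in this work: it assumes a history length of 1, equal-width binning into three levels, and synthetic toy data in which gene y drives gene x with a lag of one time step. The function names `discretize` and `transfer_entropy` and all parameter values are hypothetical.

```python
import math
import random
from collections import Counter


def discretize(series, n_bins=3):
    """Equal-width binning of a real-valued series into labels 0..n_bins-1."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0] * len(series)
    width = (hi - lo) / n_bins
    # The maximum value would fall into bin n_bins, so clip it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in series]


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    TE = sum p(x_next, x_prev, y_prev)
             * log2( p(x_next | x_prev, y_prev) / p(x_next | x_prev) )
    where x is the target series and y is the source series.
    """
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    c_xyz = Counter(triples)                       # (x_next, x_prev, y_prev)
    c_xx = Counter((a, b) for a, b, _ in triples)  # (x_next, x_prev)
    c_xy = Counter((b, c) for _, b, c in triples)  # (x_prev, y_prev)
    c_x = Counter(b for _, b, _ in triples)        # x_prev
    te = 0.0
    for (a, b, c), k in c_xyz.items():
        p_joint = k / n
        cond_full = k / c_xy[(b, c)]       # p(x_next | x_prev, y_prev)
        cond_self = c_xx[(a, b)] / c_x[b]  # p(x_next | x_prev)
        te += p_joint * math.log2(cond_full / cond_self)
    return te


# Toy data (an assumption for illustration): y drives x with a lag of one step.
rng = random.Random(0)
n = 2000
y = [rng.gauss(0.0, 1.0) for _ in range(n)]
x = [0.0] + [y[t] + 0.1 * rng.gauss(0.0, 1.0) for t in range(n - 1)]

xd, yd = discretize(x), discretize(y)
te_yx = transfer_entropy(yd, xd)  # information flowing from y to x
te_xy = transfer_entropy(xd, yd)  # information flowing from x to y
print("TE(y -> x) =", te_yx)
print("TE(x -> y) =", te_xy)
```

Because the measure is asymmetric, a larger value in one direction than in the other can be read as a candidate directed edge, which is what separates it from a symmetric co-expression score. Changing `n_bins` or the binning rule changes the estimated probabilities and therefore the estimate, in line with the sensitivity to discretization noted in the abstract.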