ALTERING STEADY-STATE PROBABILITIES IN PROBABILISTIC BOOLEAN NETWORKS

Ranadip Pal a, Aniruddha Datta a, Edward R. Dougherty a,b

a Texas A&M University, Electrical and Computer Engineering, College Station, TX, 77843, USA.
b Translational Genomics Research Institute, Phoenix, AZ, 85004, USA

ABSTRACT

External control of a genetic regulatory network is used for the purpose of avoiding undesirable states, such as those associated with disease. Heretofore, intervention has focused on finite-horizon control, i.e., control over a small number of stages. This paper considers the design of optimal infinite-horizon control for Probabilistic Boolean Networks (PBNs). The stationary policy obtained is independent of time and dependent on the current state. The average-cost-per-stage problem formulation is used to generate the stationary policy for a PBN constructed from melanoma gene-expression data. The results show that the stationary policy obtained is capable of shifting the probability mass of the stationary distribution from undesirable states to desirable ones.

1. INTRODUCTION

Two major goals of functional genomics are to develop methodologies for diagnosing the presence or type of disease and to develop therapies based on the disruption or mitigation of aberrant gene function contributing to the pathology of a disease. Mitigation would be accomplished by the use of drugs to act on the gene products. Engineering therapeutic tools involves synthesizing nonlinear dynamical networks, analyzing these networks to characterize gene regulation, and developing intervention strategies to modify dynamical behavior. To date, external intervention has been studied in the context of Probabilistic Boolean Networks (PBNs) and has focused on manipulating external (control) variables that affect the transition probabilities of a PBN to desirably affect its dynamic evolution over a finite time horizon [1, 2].
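To make the notion of an external control variable concrete, the following toy sketch (the network, genes, and rewiring rule are illustrative assumptions, not taken from the paper) shows a two-gene Boolean network in which a binary control input alters one gene's update rule, and hence the network's transitions:

```python
import itertools

# Hypothetical 2-gene Boolean network with one binary control input u.
# Each gene's next value is a Boolean function of the current state;
# the control input u rewires the regulation of gene 2 when active.
def next_state(state, u):
    x1, x2 = state
    nx1 = x1 or x2                      # gene 1: OR of both genes
    nx2 = (not x1) if u == 0 else x1    # u = 1 flips gene 2's regulation
    return (int(nx1), int(nx2))

# Enumerate the controlled transition map over all four states.
for s in itertools.product([0, 1], repeat=2):
    print(s, "-> u=0:", next_state(s, 0), " u=1:", next_state(s, 1))
```

Applying the control at each stage steers which transitions the network can take; in a PBN the update functions are themselves chosen probabilistically, so the control instead modifies transition probabilities.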
All of these results have been obtained by exploiting the fact that the dynamic behavior of the PBN can be modeled by a Markov chain, thereby making the PBN amenable to the theory of Markov chains and Markov decision processes. These short-term policies are not always effective in changing the steady-state behavior of the PBN, even though they can change the dynamical performance of the network for a small number of stages. An approach to change the steady-state probabilities by modifying the underlying rule-based structure was considered in [3] using genetic algorithms. In this paper, we consider intervention via external control variables in PBNs over an infinite length of time. We desire a control policy that does not change from one time step to the next because implementations of such stationary policies are often simple, and stationary policies can be used to shift the steady-state distribution from undesirable states to desirable ones.

Calculating the optimal stationary policy is quite involved. In the case of finite-horizon control, we can use a backward dynamic programming algorithm and terminate it once the first stage is reached; in the case of infinite-horizon control, this approach cannot be used. The optimization of the total cost rests on the fact that the total cost is finite for at least some control policy; however, if there is no termination state (state with zero cost), then the total cost may tend to infinity. This is the case with PBNs (ergodic Markov chains). Hence we employ the average cost approach. The average cost approach divides the total cost by the number of stages, a normalization that prevents the divergence of the cost to infinity.

This research was supported by the National Science Foundation (ECS0355227 and CCF0514644) and the National Cancer Institute (CA90301).
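The goal of shifting steady-state mass can be illustrated on a toy ergodic chain (the matrices below are illustrative assumptions, not the melanoma PBN of the paper): a sustained intervention replaces the uncontrolled transition matrix with a controlled one, and the stationary distribution moves from the undesirable state to the desirable one.

```python
import numpy as np

# Toy 2-state ergodic chains; state 0 is "undesirable", state 1 "desirable"
# (labels assumed for illustration).
P_uncontrolled = np.array([[0.9, 0.1],
                           [0.5, 0.5]])
P_controlled   = np.array([[0.3, 0.7],
                           [0.1, 0.9]])

def stationary(P, iters=200):
    """Stationary distribution pi = pi @ P via power iteration."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iters):
        pi = pi @ P
    return pi

pi0 = stationary(P_uncontrolled)   # mass concentrates on state 0
pi1 = stationary(P_controlled)     # mass shifts to state 1
print(pi0, pi1)
```

For these matrices the desirable state's steady-state probability rises from 1/6 to 7/8: exactly the kind of shift a stationary control policy is meant to achieve.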
A PBN with control can be modeled as a stationary discrete-time dynamic system

z_{t+1} = f(z_t, u_t, w_t),  t = 0, 1, ...,   (1)

where for all t, the state z_t is an element of a space S, the control input u_t is an element of a space C, the disturbance w_t is an element of a space D, and f : S \times C \times D \to S. The average cost per stage is defined by

J_\pi(z_0) = \lim_{M \to \infty} \frac{1}{M} E\Big\{ \sum_{t=0}^{M-1} \tilde{g}(z_t, \mu_t(z_t), w_t) \Big\},   (2)

where \mu_t : S \to C, t = 0, 1, 2, ..., M-1, are functions mapping the state space into the control space, i.e., the controls considered are state feedbacks; \tilde{g}(z_t, \mu_t(z_t), w_t) is the cost per stage. Let us denote by \Pi the set of all admissible policies \pi, i.e., the set of all sequences of functions \pi = \mu_0, \mu_1, .... with