Sensor Scheduling for Observation of a Markov Process

Mohammad Rezaeian, Sofia Suvorova, Bill Moran
Department of Electrical and Electronic Eng., University of Melbourne, Victoria 3010, Australia
Email: rezaeian,s.suvorova,b.moran@ee.unimelb.edu.au

Abstract

We study the optimal scheduling of a set of sensors for observation of a Markov process. The Markov process, called the state process, in conjunction with the measurement processes given by the readouts of the sensors, defines a generalized hidden Markov process. The dynamics of the state process are characterized by a transition probability matrix P, while the various sensors observing the state are characterized by a set of observation probability matrices T_k (k = 1, 2, ..., K). The criterion for optimality is to minimize the conditional entropy of the state given past measurements, for a sufficiently large number of measurements; thus the optimal schedule provides the minimum ambiguity about the state given all past readouts of the sensors selected under that schedule. A schedule is characterized by a particular partitioning of the probability simplex associated with the state estimate. Under a given schedule, the sensor selected at each epoch depends on which element of the partition contains the current state estimate. We seek a stationary solution to this scheduling problem.

I. INTRODUCTION

For the purpose of this paper, a hidden Markov process {S_n} (n = 0, 1, 2, ...) is generalized to be a process defined by [S, P, T_k (k = 1, 2, ..., K), Z], where S and Z are the sets of possible states and measurement outcomes, respectively, P is a transition probability matrix, and T_k (k = 1, 2, ..., K) are observation probability matrices. In contrast to the usual hidden Markov process [1]-[2], which has only one observation probability matrix, here the measurement Z_n at time n is related to the state S_n through the observation probability matrix T_{k_n}, which varies with the time index n.
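As a concrete illustration of this generalized hidden Markov process, the following sketch simulates a two-state chain observed through one of two sensors at each step. All numerical values (the matrices P and T_k, and the alternating schedule) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative transition matrix P: row s gives p(S_{n+1} = . | S_n = s).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])

# Illustrative observation matrices T_k: row s of T[k] gives p(Z_n = . | S_n = s)
# when sensor k is used. Sensor 0 is more informative about state 0,
# sensor 1 about state 1.
T = [np.array([[0.95, 0.05],
               [0.30, 0.70]]),
     np.array([[0.60, 0.40],
               [0.05, 0.95]])]

def simulate(schedule, n_steps, pi0=np.array([0.5, 0.5])):
    """Draw a state/measurement trajectory; schedule[n] picks the sensor k_n at time n."""
    s = rng.choice(2, p=pi0)
    states, meas = [], []
    for n in range(n_steps):
        k = schedule[n]
        z = rng.choice(2, p=T[k][s])   # measurement Z_n through sensor k_n
        states.append(s)
        meas.append(z)
        s = rng.choice(2, p=P[s])      # state transition S_n -> S_{n+1}
    return states, meas

states, meas = simulate(schedule=[0, 1] * 5, n_steps=10)
```

Here the schedule is a fixed alternating sequence purely for demonstration; the point of the paper is that the sensor choice should instead depend on the current state estimate.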
The purpose of this paper is to find an optimal policy for choosing the measurement sensor k_n, based on the current state estimate, with the goal of achieving minimum entropy for the state. This problem arises in applications requiring optimal usage of a set of sensors for observation of a Markov process, where the system or its resource management can afford only one sensor at a time. For example, in a radar system only one waveform out of a set can be used at each pulse transmission [3].

For a hidden Markov process defined as above, let Δ be the space of probability measures on the state space S, i.e., the set of vectors π of nonnegative real numbers with sum equal to 1, and let P(Δ) be the space of probability distributions on Δ. In this paper the probability Pr(X = x) is written p(x) (and similarly for conditional probabilities), whereas p(X) denotes a row vector representing the distribution of X, i.e., the k-th element of the vector p(X) is Pr(X = k). We also denote the history of all measurements up to and including time n-1 by Z^{n-1}.

We adopt the concept of the information state from Partially Observed Markov Decision Processes (POMDPs) [4],[5]. We denote the information state by π_n, a random variable on Δ given by

    π_n(Z^{n-1}) = p(S_n | Z^{n-1}),    (1)

This work was supported in part by the Defense Advanced Research Projects Agency of the US Department of Defense and was monitored by the Office of Naval Research under Contract No. N00014-04-C-0437.
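The information state in (1) can be propagated by the standard HMM filter recursion: a Bayes correction using the observation matrix of the chosen sensor, followed by a prediction step through P. The sketch below shows this recursion together with the Shannon entropy of the information state, which is the quantity the scheduling criterion seeks to minimize; the matrices are illustrative assumptions, not values from the paper.

```python
import numpy as np

def update_info_state(pi, P, Tk, z):
    """One step of the information-state recursion (standard HMM filter).
    pi = p(S_n | Z^{n-1}); Tk is the observation matrix of the sensor used at
    time n; z is the observed measurement. Returns p(S_{n+1} | Z^n)."""
    posterior = pi * Tk[:, z]          # Bayes correction: p(S_n | Z^n), unnormalized
    posterior /= posterior.sum()
    return posterior @ P               # prediction through the transition matrix

def entropy(pi):
    """Shannon entropy H(pi) in bits, the ambiguity measure for the state."""
    p = pi[pi > 0]
    return float(-(p * np.log2(p)).sum())

# Illustrative two-state example: a confident measurement sharpens the estimate
# and lowers the entropy below that of the uniform prior.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
Tk = np.array([[0.95, 0.05],
               [0.30, 0.70]])
pi = np.array([0.5, 0.5])
pi_next = update_info_state(pi, P, Tk, z=0)
```

A sensor-scheduling policy of the kind studied in the paper would, at each epoch, inspect which region of the simplex Δ the current π_n lies in and select the sensor assigned to that region.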