Modeling Multiple-mode Systems with Predictive State Representations Britton Wolfe Computer Science Indiana University-Purdue University Fort Wayne wolfeb@ipfw.edu Michael R. James AI and Robotics Group Toyota Technical Center michael.r.james@gmail.com Satinder Singh Computer Science and Engineering University of Michigan baveja@umich.edu Abstract— Predictive state representations (PSRs) are a class of models that represent the state of a dynamical system as a set of predictions about future events. This work introduces a class of structured PSR models called multi-mode PSRs (MMPSRs), which were inspired by the problem of modeling traffic. In general, MMPSRs can model uncontrolled dynamical systems that switch between several modes of operation. An important aspect of the model is that the modes must be recognizable from a window of past and future observations. Allowing modes to depend upon future observations means the MMPSR can model systems where the mode cannot be determined from only the past observations. Requiring modes to be defined in terms of observations makes the MMPSR different from hierarchical latent-variable based models. This difference is significant for learning the MMPSR, because there is no need for costly estimation of the modes in the training data: their true values are known from the mode definitions. Furthermore, the MMPSR exploits the modes’ recognizability by adjusting its state values to reflect the true modes of the past as they become revealed. Our empirical evaluation of the MMPSR shows that the accuracy of a learned MMPSR model compares favorably with other learned models in predicting both simulated and real-world highway traffic. Index Terms— highway traffic, predictive state representa- tions, dynamical systems I. MULTI - MODE PSRS (MMPSRS) Predictive state representations (PSRs) [3] are a class of models that represent the state of a dynamical system as a set of predictions about future events. PSRs are capable of representing partially observable, stochastic dynamical systems, including any system that can be modeled by a finite partially observable Markov decision process (POMDP) [5]. There is evidence that predictive state is useful for general- ization [4] and helps to learn more accurate models than the state representation of a POMDP [7]. This work introduces a class of structured hierarchical PSR models called multi- mode PSRs (MMPSRs) for modeling uncontrolled dynamical systems that switch between several modes of operation. Unlike latent-variable models like hierarchical HMMs [2], the MMPSR requires that the modes be a function of past and future observations. This requirement yields advantages both when learning and using an MMPSR, as explained throughout this section. The MMPSR is inspired by the problem of predicting cars’ movements on a highway. One way to predict the car’s movements would be to determine what mode of behavior the car was in — e.g., a left lane change, right lane change, or going straight — and make predictions about the car’s movement conditioned upon that mode of behavior. The MMPSR makes predictions in this way using two component models which form a simple, two-level hierarchy (Figure 1). When modeling highway traffic, the high-level model will predict the mode of behavior, and the low-level model will make predictions about the car’s future positions conditioned upon the mode of behavior. The remainder of this section formalizes the MMPSR model in general terms, making it applicable to dynamical systems other than highway traffic. A. Observations and Modes The MMPSR can model uncontrolled, discrete-time dy- namical systems, where the agent receives some observation O i at each time step i =1, 2,.... The observations can be vector-valued and can be discrete or continuous. In addition to the observations that the agent receives from the dynamical system, the MMPSR requires that there exists a discrete set of modes the system could be in, and that there is some mode associated with each time step. The system can be in the same mode for several time steps, so a single mode can be associated with multiple contiguous time steps. The i th mode since the beginning of time will be denoted by ψ i , and ψ(τ ) will denote the mode for the τ time step (Figure 2). What distinguishes MMPSR models from hierarchical latent-variable models (e.g., hierarchical HMMs [2]) is the fact that the modes are not latent. Instead, they are de- fined in terms of past and possibly future observations. Specifically, the modes used by an MMPSR must sat- isfy the following recognizability requirement: There is some finite k such that, for any sequence of observations O 1 ,...,O τ ,O τ +1 ,...O τ +k (for any τ ≥ 0), the modes ψ(1),...,ψ(τ -1),ψ(τ ) are known at time τ +k (or before). We say that a mode ψ(τ ) is known at time τ ′ (where τ ′ can be greater than τ ) if the definitions of the modes and the observations O 1 ,...,O τ ′ from the beginning of time through time τ ′ unambiguously determine the value of ψ(τ ). To reiterate, the recognizability requirement differentiates the MMPSR from hierarchical latent-variable models. If one were to incorporate the fact that modes were recognizable into a hierarchical latent-variable model, one would in effect get an MMPSR. The recognizability of the modes plays a crucial role in learning an MMPSR, because the modes for the batch of training data are known. If the modes were not recognizable, one would have to use the expectation- maximization algorithm to estimate the latent modes, as is