Analysis of averages over distributions of Markov processes

Patricio E. Valenzuela*, Cristian R. Rojas, Håkan Hjalmarsson
Department of Automatic Control and ACCESS Linnaeus Center, School of Electrical Engineering, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden

Automatica (2018), https://doi.org/10.1016/j.automatica.2018.09.016. Technical communique.

Article history: Received 28 July 2017; Received in revised form 25 March 2018; Accepted 19 July 2018.

Keywords: System identification; Input design; Markov chains

Abstract

In problems of optimal control of Markov decision processes and optimal design of experiments, the occupation measure of a Markov process is designed in order to maximize a specific reward function. When the memory of such a process is too long, or the process is non-Markovian but mixing, it makes sense to approximate it by that of a shorter memory Markov process. This note provides a specific bound for the approximation error introduced in these schemes. The derived bound is then applied to the proposed solution of a recently introduced approach to optimal input design for nonlinear systems.

© 2018 Elsevier Ltd. All rights reserved.

1. Introduction

The field of Markov Decision Processes (MDPs) is a very mature area of research (Puterman, 1994), where the goal is usually to design the action policy in order to maximize an average or discounted reward function.
There are two main, dual approaches to solve these problems, namely, through dynamic programming (Bellman, 1957) or via occupation measures (Borkar, 1988) (which correspond to the stationary probabilities of the joint action–state pair); the latter approach has some advantages over the former, for example, for MDP problems subject to average constraints (Altman, 1999). The solution scheme for MDPs based on occupation measures has found applications in other control-related problems such as nonlinear optimal control (Lasserre, Prieur, & Henrion, 2005; Vaidya, Mehta, & Shanbhag, 2010), stability analysis (Vaidya & Mehta, 2008) and optimal input design for nonlinear systems (Valenzuela, Rojas, & Hjalmarsson, 2015).

For some of these problems, in particular for input design (Valenzuela et al., 2015), the Markovian assumption is actually an approximation, in the sense that the stochastic process (the output of a nonlinear system) is a mixing process, so the conditional distribution of the current value, given the entire past, is approximately equal to the conditional distribution given a finite number of values of its most recent past. This approximation is indeed essential for the application of the occupation measure approach.

[Footnote: This work was supported by the Swedish Research Council under contracts 621-2011-5890 and 621-2009-4017. The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor A. Pedro Aguiar under the direction of Editor André L. Tits. *Corresponding author. E-mail addresses: pva@kth.se (P.E. Valenzuela), crro@kth.se (C.R. Rojas), hjalmars@kth.se (H. Hjalmarsson).]

In this paper we revisit the validity of this Markovian approximation.
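As a numerical illustration of the occupation-measure viewpoint described above (this is a generic sketch, not the algorithm of any of the cited works), the occupation measure of a finite-state Markov chain is its stationary distribution, which solves the linear system pi P = pi with the entries of pi summing to one; the transition matrix and per-state reward below are hypothetical:

```python
import numpy as np

# Hypothetical 3-state transition matrix P (rows sum to 1):
# P[i, j] = probability of moving from state i to state j.
P = np.array([[0.5, 0.4, 0.1],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])

# The stationary (occupation) measure pi solves pi P = pi, sum(pi) = 1.
# Solve the equivalent system (P^T - I) pi = 0 with the normalization
# constraint appended as an extra equation, via least squares.
n = P.shape[0]
A = np.vstack([P.T - np.eye(n), np.ones(n)])
b = np.concatenate([np.zeros(n), [1.0]])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

# Average reward under the occupation measure, for a hypothetical
# per-state reward vector r; in MDP formulations, pi is the decision
# variable designed to maximize such an average.
r = np.array([1.0, 0.0, 2.0])
avg_reward = pi @ r
print(pi, avg_reward)
```

In the occupation-measure formulation of an MDP, the average reward is linear in pi, which is what makes average-constrained problems tractable as linear programs.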
In particular, we consider a mixing process corresponding to the output of an exponentially forgetting system (Ljung, 1978) driven by a Markov process of finite memory, and we provide rigorous bounds on the difference between the mean of a function of this process and that of its Markovian approximation, as a function of the length of its memory. The bound obtained tends to zero as the memory length grows to infinity, establishing the validity of the Markovian approximation. We also apply this bound to the input design approach in Valenzuela et al. (2015), and provide a bound on the accuracy of that procedure.

The structure of this article is as follows. Section 2 presents preliminaries on Markov processes. Section 3 introduces the main results of this manuscript. Finally, Section 4 presents concluding remarks.

2. Preliminaries on Markov processes

This section introduces the elements from the theory of Markov processes required in the main result of this note (Theorem 1).

A Markov process is a stochastic process whose conditional probability distributions, given its past, depend only on its nearest past with probability one (Doob, 1953, p. 80). In the following, we consider a Markov process defined for all $t \geq 0$ as

$$x_{t+1} \sim p(x_{t+1} \mid x_t), \qquad (1)$$

where $p$ is a conditional probability mass function, $x_t \in \mathcal{X}$ for all $t \geq 0$, and $\mathcal{X}$ is a finite set. Based on (1), we recursively define the
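A Markov process of the form (1) on a finite set can be sketched numerically as follows; the two-state kernel p below is an assumption chosen for illustration, and the exact stationary distribution of such a chain is known in closed form, so the empirical occupation frequencies of a long simulated run can be checked against it:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite state space X = {0, 1}; the conditional pmf
# p(x_{t+1} | x_t) from (1) is encoded as a row-stochastic matrix.
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])

# Simulate x_{t+1} ~ p(. | x_t), counting visits to each state.
T = 200_000
x = 0
counts = np.zeros(2)
for _ in range(T):
    counts[x] += 1
    x = rng.choice(2, p=P[x])

empirical = counts / T

# Exact stationary distribution of a two-state chain with
# transition probabilities a = P[0, 1], b = P[1, 0]:
# pi = (b, a) / (a + b), here (0.3, 0.1) / 0.4 = (0.75, 0.25).
exact = np.array([P[1, 0], P[0, 1]]) / (P[0, 1] + P[1, 0])
print(empirical, exact)
```

The empirical frequencies converge to the stationary distribution as T grows, which is the ergodic property underlying the use of occupation measures as design variables.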