IFAC PapersOnLine 50-1 (2017) 5953–5960

ScienceDirect. Available online at www.sciencedirect.com

2405-8963 © 2017, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
Peer review under responsibility of International Federation of Automatic Control.
DOI: 10.1016/j.ifacol.2017.08.1498

This work has been performed within the Combustion Engine Research Center at Chalmers (CERC) with financial support from the Swedish Energy Agency.

Keywords: Dynamic programming, Optimal control, Global optimization, Nonlinear control, Bang-bang control, Efficiency enhancement

1. INTRODUCTION

Non-causal global nonlinear constrained optimal control is a notoriously difficult problem which, in general, does not have a known analytical solution. Hence, it is often necessary to use numerical methods. One method that is often used for finite-dimensional problems is dynamic programming (DP). For example, DP is often used for designing hybrid vehicle controllers, where DP is typically used to benchmark the quality of simpler, suboptimal, causal controllers (Liu and Peng (2008); Pérez et al. (2006); Sciarretta and Guzzella (2007)). DP is guaranteed to generate the global optimum for problems that can be represented in a graph. However, DP is computationally complex for multidimensional problems, where the required number of computations scales exponentially with the number of dimensions.

This paper presents a new DP method (and an implementation of it in Matlab) for multidimensional problems that can be described as a loosely coupled set of ordinary differential/difference equations with different time scales. For problems of this type the presented method significantly reduces the time required to generate a solution, and furthermore increases the likelihood of generating a feasible solution.
One example of an application that this method works well for is that of hybrid vehicle control, where performance gains on the order of the quotient of the system’s time scales are realizable. Typically, this gives a performance improvement on the order of $10^2$ to $10^4$.

In this paper DP is used to solve a discrete-valued, discrete-time approximation of a continuous-value problem in discrete- or continuous-time. The DP method used in this paper starts with a backward-calculation phase where, for a sample $k$, each element from a set of system inputs $U_k$ is exhaustively applied to each element of a set of system states $X_k$. The best control $u_{\mathrm{opt}}[k]$ and corresponding cost $c_{\mathrm{opt}}[k]$ are stored for every system state, where the best control and cost minimizes the total cost from the current sample to the final sample. This process is repeated for all samples starting from the next-to-last sample and working backwards to the first sample.

The optimal control and state trajectories are generated in a forward-calculation phase, where for a given initial state the best stored control signal $u_{\mathrm{opt}}[k]$ is successively applied to the system state $x[k]$ for all samples. Interpolation is used when the system state $x[k]$ does not exactly match one of the states evaluated during the back-calculation phase. This directly gives the optimal control and state trajectories $u_{\mathrm{opt}}[k]$ and $x_{\mathrm{opt}}[k]$.

A formal definition of DP for optimal control is beyond the

* Dept. of Signals and Systems, Chalmers University of Technology, SE-412 96 Gothenburg, Sweden, lock@chalmers.se
** Dept. of Signals and Systems, Chalmers University of Technology, SE-412 96 Gothenburg, Sweden, mckelvey@chalmers.se

Abstract: Iterative dynamic programming is a powerful method that is often used to solve finite-dimensional nonlinear constrained global optimal control problems.
However, multi-dimensional problems are often computationally complex, and in some cases an infeasible result is generated despite the existence of a feasible solution. A new iterative multi-pass method is presented that reduces the execution time of multi-dimensional, loosely-coupled, dynamic programming problems, where some state variables exhibit dynamic behavior with time scales significantly smaller than the others. One potential application is the optimal control of a hybrid electrical vehicle, where the computational burden can be reduced by a factor on the order of 100 – 10000. Furthermore, new regularization terms are introduced that typically improve the likelihood of generating a feasible optimal trajectory. Though the regularization terms may generate suboptimal solutions in the interim, with successive iterations the generated solution typically asymptotically approaches the true optimal solution.

Note: Full source code is freely available online with an implementation of the solver, some usage examples, and the test cases used to generate the results shown in this paper.

Jonathan Lock * Tomas McKelvey **

A Computationally Fast Iterative Dynamic Programming Method for Optimal Control of Loosely Coupled Dynamical Systems with Different Time Scales
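The backward- and forward-calculation phases described in the introduction can be illustrated with a minimal sketch. This is plain single-pass DP on a scalar toy problem, not the iterative multi-pass method of the paper and not the authors' Matlab implementation. The function names `solve_dp` and `simulate`, the toy dynamics, the cost functions, and the grid sizes are all hypothetical choices made for illustration. The sketch also assumes that every grid state has at least one input that keeps the successor state on the grid, so that the stored costs stay finite.

```python
import numpy as np


def solve_dp(f, stage_cost, terminal_cost, x_grid, u_grid, n_samples):
    """Backward phase: store the best control and cost for every grid state."""
    n_x = len(x_grid)
    c_opt = np.empty((n_samples + 1, n_x))
    u_opt = np.empty((n_samples, n_x))
    c_opt[n_samples] = terminal_cost(x_grid)
    # Start at the next-to-last sample and work backwards to the first.
    for k in range(n_samples - 1, -1, -1):
        # Exhaustively apply every input to every state
        # (rows index states, columns index inputs).
        x_next = f(x_grid[:, None], u_grid[None, :], k)
        # Interpolate the stored cost at the successor states.
        cost_to_go = np.interp(
            x_next.ravel(), x_grid, c_opt[k + 1]
        ).reshape(x_next.shape)
        # Successor states outside the grid are treated as infeasible.
        outside = (x_next < x_grid[0]) | (x_next > x_grid[-1])
        cost_to_go[outside] = np.inf
        total = stage_cost(x_grid[:, None], u_grid[None, :], k) + cost_to_go
        best = np.argmin(total, axis=1)
        u_opt[k] = u_grid[best]
        c_opt[k] = total[np.arange(n_x), best]
    return u_opt, c_opt


def simulate(f, u_opt, x_grid, x0):
    """Forward phase: successively apply the stored control from x0."""
    n_samples = u_opt.shape[0]
    x = np.empty(n_samples + 1)
    u = np.empty(n_samples)
    x[0] = x0
    for k in range(n_samples):
        # Interpolate when x[k] does not exactly match a grid state.
        u[k] = np.interp(x[k], x_grid, u_opt[k])
        x[k + 1] = f(x[k], u[k], k)
    return u, x


# Hypothetical toy problem: scalar integrator with quadratic costs.
def f(x, u, k):
    return x + u


def stage_cost(x, u, k):
    return x**2 + u**2


def terminal_cost(x):
    return 10.0 * x**2


x_grid = np.linspace(-2.0, 2.0, 81)
u_grid = np.linspace(-1.0, 1.0, 41)
u_opt, c_opt = solve_dp(f, stage_cost, terminal_cost, x_grid, u_grid, n_samples=20)
u_traj, x_traj = simulate(f, u_opt, x_grid, x0=1.5)
```

For one state dimension the backward phase evaluates `len(x_grid) * len(u_grid)` combinations per sample. Adding further state dimensions multiplies the number of grid states, which is the exponential growth the introduction refers to.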