IFAC PapersOnLine 50-1 (2017) 5953–5960
2405-8963 © 2017, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
Peer review under responsibility of International Federation of Automatic Control.
10.1016/j.ifacol.2017.08.1498
Keywords: Dynamic programming, Optimal control, Global optimization, Nonlinear control,
Bang-bang control, Efficiency enhancement
1. INTRODUCTION
Non-causal global nonlinear constrained optimal control
is a notoriously difficult problem which, in general, does
not have a known analytical solution. Hence, it is often
necessary to use numerical methods. One method that
is often used for finite-dimensional problems is dynamic
programming (DP). For example, DP is often used for
designing hybrid vehicle controllers, where DP is typically
used to benchmark the quality of simpler, suboptimal,
causal controllers (Liu and Peng (2008); Pérez et al. (2006);
Sciarretta and Guzzella (2007)). DP is guaranteed to
generate the global optimum for problems that can be
represented in a graph. However, DP is computationally
complex for multidimensional problems, where the required
number of computations scales exponentially with the
number of dimensions.
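To make the scaling concrete, consider a rough count under assumptions that are not taken from this paper: a uniform grid with $N_x$ points per state dimension and $N_u$ points per input dimension, $n$ states, $m$ inputs, and $N$ samples. An exhaustive backward pass then needs approximately
$$N \cdot N_x^{\,n} \cdot N_u^{\,m}$$
model evaluations. With $N_x = 100$, for instance, going from $n = 1$ to $n = 3$ states multiplies the work by $100^2 = 10^4$.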
This paper presents a new DP method (and an implementation
of it in Matlab) for multidimensional problems
that can be described as a loosely coupled set of ordinary
differential/difference equations with different time scales.
For problems of this type the presented method significantly
reduces the time required to generate a solution, and
furthermore increases the likelihood of generating a feasible
solution. One example of an application that this method
works well for is that of hybrid vehicle control, where
performance gains on the order of the quotient of the
system’s time scales are realizable. Typically, this gives a
performance improvement on the order of $10^2$ to $10^4$.

⋆ This work has been performed within the Combustion Engine
Research Center at Chalmers (CERC) with financial support from
the Swedish Energy Agency.

In this paper DP is used to solve a discrete-valued,
discrete-time approximation of a continuous-valued problem in
discrete or continuous time. The DP method used in this
paper starts with a backward-calculation phase where, for
a sample $k$, each element of a set of system inputs $U_k$
is exhaustively applied to each element of a set of system
states $X_k$. The best control $u_{\mathrm{opt}}[k]$ and the corresponding
cost $c_{\mathrm{opt}}[k]$ are stored for every system state, where the
best control is the one that minimizes the total cost from the
current sample to the final sample. This process is repeated for all
samples, starting from the next-to-last sample and working
backwards to the first sample. The optimal control and state
trajectories are generated in a forward-calculation phase,
where, for a given initial state, the best stored control signal
$u_{\mathrm{opt}}[k]$ is successively applied to the system state $x[k]$ for
all samples. Interpolation is used when the system state
$x[k]$ does not exactly match one of the states evaluated
during the backward-calculation phase. This directly gives the
optimal control and state trajectories $u_{\mathrm{opt}}[k]$ and $x_{\mathrm{opt}}[k]$.
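The backward and forward phases described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' Matlab implementation: it handles a single scalar state and a single scalar input, uses the same state and input grids at every sample, treats successor states outside the state grid as infeasible, and uses linear interpolation. The function name `dp_solve`, its arguments, and the toy model at the bottom are all hypothetical.

```python
import numpy as np

def dp_solve(f, g, terminal, X, U, N, x0):
    """Grid-based DP sketch for a scalar state and a scalar input."""
    J = np.empty((N + 1, X.size))    # cost-to-go (c_opt[k]) per grid state
    u_opt = np.empty((N, X.size))    # best control (u_opt[k]) per grid state
    J[N] = terminal(X)

    # Backward phase: from the next-to-last sample down to the first.
    for k in range(N - 1, -1, -1):
        # Every (state, input) pair: rows follow X, columns follow U.
        xg, ug = np.meshgrid(X, U, indexing="ij")
        x_next = f(xg, ug, k)
        # Interpolate the stored cost-to-go at the successor states.
        tail = np.interp(x_next.ravel(), X, J[k + 1]).reshape(x_next.shape)
        total = g(xg, ug, k) + tail
        # Assumption: successors outside the state grid are infeasible.
        total[(x_next < X[0]) | (x_next > X[-1])] = np.inf
        best = np.argmin(total, axis=1)
        u_opt[k] = U[best]
        J[k] = total[np.arange(X.size), best]

    # Forward phase: apply the stored control, interpolating between
    # grid states when x[k] does not fall exactly on the grid.
    x_traj = np.empty(N + 1)
    u_traj = np.empty(N)
    x_traj[0] = x0
    for k in range(N):
        u_traj[k] = np.interp(x_traj[k], X, u_opt[k])
        x_traj[k + 1] = f(x_traj[k], u_traj[k], k)
    return x_traj, u_traj, J

# Hypothetical toy problem: an integrator with quadratic costs.
X = np.linspace(-2.0, 2.0, 41)
U = np.linspace(-1.0, 1.0, 21)
x_traj, u_traj, J = dp_solve(
    f=lambda x, u, k: x + u,
    g=lambda x, u, k: x**2 + u**2,
    terminal=lambda x: 10.0 * x**2,
    X=X, U=U, N=10, x0=1.5,
)
```

The backward loop evaluates every state-input pair at every sample, which is the exhaustive search whose cost grows rapidly with the number of state and input dimensions. Interpolating the control in the forward phase is one simple choice; re-minimizing over the input grid at each forward step is a common alternative.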
A formal definition of DP for optimal control is beyond the
* Dept. of Signals and Systems, Chalmers University of Technology,
SE-412 96 Gothenburg, Sweden, lock@chalmers.se
** Dept. of Signals and Systems, Chalmers University of Technology,
SE-412 96 Gothenburg, Sweden, mckelvey@chalmers.se
Abstract: Iterative dynamic programming is a powerful method that is often used to solve
finite-dimensional nonlinear constrained global optimal control problems. However, multi-
dimensional problems are often computationally complex, and in some cases an infeasible result
is generated despite the existence of a feasible solution. A new iterative multi-pass method
is presented that reduces the execution time of multi-dimensional, loosely-coupled, dynamic
programming problems, where some state variables exhibit dynamic behavior with time scales
significantly smaller than the others. One potential application is the optimal control of a hybrid
electric vehicle, where the computational burden can be reduced by a factor on the order of
100 to 10000. Furthermore, new regularization terms are introduced that typically improve the
likelihood of generating a feasible optimal trajectory. Though the regularization terms may
generate suboptimal solutions in the interim, with successive iterations the generated solution
typically asymptotically approaches the true optimal solution.
Note: Full source code is freely available online with an implementation of the solver, some usage
examples, and the test cases used to generate the results shown in this paper.
Jonathan Lock*, Tomas McKelvey**
A Computationally Fast Iterative Dynamic
Programming Method for Optimal Control
of Loosely Coupled Dynamical Systems
with Different Time Scales ⋆