Neurocomputing xxx (xxxx) xxx
Neurocomputing
journal homepage: www.elsevier.com/locate/neucom
Robust control design for multi-player nonlinear systems with input
disturbances via adaptive dynamic programming
Qiuxia Qu a,b, Huaguang Zhang a,∗, Chaomin Luo c, Rui Yu a
a College of Information Science and Engineering, Northeastern University, Shenyang, 110819, China
b College of Information and Control Engineering, Shenyang Jianzhu University, Shenyang, Liaoning 100168, China
c Department of Electrical and Computer Engineering, University of Detroit Mercy, Michigan, USA
Article info
Article history:
Received 16 February 2017
Revised 13 November 2017
Accepted 18 November 2018
Available online xxx
Communicated by Hak Keung Lam
Keywords:
Adaptive dynamic programming (ADP)
Input disturbances
Multi-player systems
Nonzero-sum games
Nash equilibrium
Neural network
Abstract
In this paper, a novel robust control strategy based on the adaptive dynamic programming (ADP)
technique is proposed for multi-player nonlinear systems with input disturbances. A pair of robust
control policies is constructed by multiplying the Nash solution of the nominal nonlinear
nonzero-sum game by appropriate coupling gains, where the predefined cost functions account for
the uncertain system disturbances. Sufficient conditions for the existence of the robust control
strategy are derived, and it is proved that the strategy guarantees that the multi-player
nonlinear system is stable in the sense of uniform ultimate boundedness (UUB) with disturbance
rejection. The single-network ADP algorithm is employed to solve the coupled Hamilton–Jacobi
equations, which requires only the online tuning of the critic neural network (NN) weights for
each player. Using Lyapunov theory, the NN weight estimation errors are proved to be uniformly
ultimately bounded, and the stability of the closed-loop nonzero-sum game system is also
guaranteed. Two numerical experiments are given to demonstrate the effectiveness of the
proposed approach.
© 2018 Elsevier B.V. All rights reserved.
1. Introduction
Multi-player systems have recently gained increasing attention
due to their extensive real-world applications in both science and
engineering, such as microgrid systems [1], autonomous guided
vehicles [2], pursuit–evasion games [3], as well as missile guidance,
military strategy, aircraft control and aerial tactics [4]. In a
multi-input system, each input is computed by a player, and each player
tries to influence the system state so as to minimize its own cost
function. The optimization problem for each player is therefore
coupled with the optimization problems of the other players.
Differential game theory provides solution concepts for many multi-player,
multi-objective optimization problems [5]. Nonzero-sum games are an
important part of game theory; their objective is to find a
pair of optimal policies called the Nash solution.
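For concreteness, the Nash solution can be written out as follows. This is only a sketch in a formulation commonly used for continuous-time N-player nonzero-sum games; the symbols f, g_j, Q_i, R_ij and the number of players N are assumptions introduced here for illustration and are not taken from the text above.

```latex
\begin{align}
\dot{x}(t) &= f(x) + \sum_{j=1}^{N} g_j(x)\,u_j, \\
J_i(x_0, u_1, \dots, u_N) &= \int_0^{\infty} \Big( Q_i(x)
   + \sum_{j=1}^{N} u_j^{\top} R_{ij}\, u_j \Big)\,dt,
   \qquad i = 1, \dots, N, \\
J_i(u_1^{*}, \dots, u_i^{*}, \dots, u_N^{*}) &\le
J_i(u_1^{*}, \dots, u_i, \dots, u_N^{*})
   \qquad \text{for all admissible } u_i,\; i = 1, \dots, N.
\end{align}
```

The last inequality expresses the equilibrium property: no player can lower its own cost by deviating unilaterally from its policy while the other players keep theirs.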
∗ Corresponding author.
E-mail addresses: quqiuxia2010@163.com (Q. Qu), hgzhang@ieee.org (H. Zhang),
chaominluo@yahoo.com (C. Luo), yuruineu@126.com (R. Yu).
Many researchers have studied nonzero-sum games using
various analysis methods, including the game theory introduced in
[6] and Lyapunov functions. As one class of reinforcement learning
techniques, ADP methods have been widely studied for solving,
forward in time, the Hamilton–Jacobi–Bellman (HJB)
equation or the coupled Hamilton–Jacobi (HJ) equations arising in optimal
control problems for stochastic systems [7,8], differential games
[9–14], and other fields [15–19]. A large number of results
for practical applications have also been obtained [20–22].
ADP methods can be categorized into two classes, namely policy
iteration (PI) and value iteration (VI) algorithms [23–29,32,33].
Recently, both PI and VI were implemented using only input-output
data to obtain an optimal controller for unknown discrete-time
linear systems in [30], and these methods were extended to
develop a linear tracking controller in [31]. It is worth mentioning
that many researchers have studied the robust adaptive control
problem for linear and nonlinear systems subject to external
disturbances via ADP methods [34–38]. Besides the aforementioned
works, inspired by Adhyaru et al. [39], Wang et al. designed
a robust controller by introducing a proper multiplicative gain, and proved that
it not only renders the system with uncertain disturbances
uniformly ultimately bounded but is also optimal [40].
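The difference between the two ADP classes can be illustrated on a small example. The sketch below is not the data-driven method of [30] or any algorithm from the cited works; it assumes a known discrete-time linear model with a quadratic cost, and the matrices A, B, Q, R, the initial gain K0 and all function names are hypothetical choices made only for illustration. PI starts from a stabilizing gain and alternates policy evaluation (a Lyapunov equation) with policy improvement, whereas VI iterates the Riccati recursion from P = 0 and needs no stabilizing initial policy.

```python
import numpy as np

def lqr_gain(A, B, R, P):
    """Policy improvement: greedy gain K for value matrix P, with u = -K x."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def value_iteration(A, B, Q, R, tol=1e-10, max_iter=10000):
    """VI: iterate the Riccati recursion starting from P = 0."""
    P = np.zeros_like(Q)
    for _ in range(max_iter):
        K = lqr_gain(A, B, R, P)
        # Equivalent to Q + A'PA - A'PB (R + B'PB)^{-1} B'PA
        P_next = Q + A.T @ P @ (A - B @ K)
        P_next = 0.5 * (P_next + P_next.T)  # keep P numerically symmetric
        if np.linalg.norm(P_next - P) < tol:
            return P_next, lqr_gain(A, B, R, P_next)
        P = P_next
    raise RuntimeError("value iteration did not converge")

def policy_evaluation(A, B, Q, R, K):
    """Solve P = Q + K'RK + (A - BK)' P (A - BK) via a Kronecker product."""
    Ac = A - B @ K
    M = Q + K.T @ R @ K
    n = A.shape[0]
    vec_p = np.linalg.solve(np.eye(n * n) - np.kron(Ac.T, Ac.T), M.reshape(-1))
    return vec_p.reshape(n, n)

def policy_iteration(A, B, Q, R, K0, tol=1e-10, max_iter=100):
    """PI: requires K0 such that A - B K0 is stable (spectral radius < 1)."""
    K, P = K0, None
    for _ in range(max_iter):
        P_new = policy_evaluation(A, B, Q, R, K)
        K = lqr_gain(A, B, R, P_new)
        if P is not None and np.linalg.norm(P_new - P) < tol:
            return P_new, K
        P = P_new
    raise RuntimeError("policy iteration did not converge")

# Hypothetical example: discretized double integrator (assumed values).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 2.0]])  # A - B K0 has both eigenvalues at 0.9

P_vi, K_vi = value_iteration(A, B, Q, R)
P_pi, K_pi = policy_iteration(A, B, Q, R, K0)
print("max |P_vi - P_pi| =", np.max(np.abs(P_vi - P_pi)))
```

In this setting both schemes approach the same Riccati solution; PI typically needs far fewer iterations, but each iteration solves a linear equation and a stabilizing initial gain must be available.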
Recently, several adaptive control algorithms have been applied
to solve nonzero-sum games for linear and nonlinear systems
and obtain the feedback Nash equilibrium. In [41], an online
PI learning algorithm was proposed for nonlinear systems, which
updates the critic-actor structure for each player synchronously
and simultaneously. Zhang et al. presented an ADP algorithm for
two-player nonlinear nonzero-sum games with a single network
https://doi.org/10.1016/j.neucom.2018.11.054
0925-2312/© 2018 Elsevier B.V. All rights reserved.