Neurocomputing xxx (xxxx) xxx
Journal homepage: www.elsevier.com/locate/neucom

Robust control design for multi-player nonlinear systems with input disturbances via adaptive dynamic programming

Qiuxia Qu a,b, Huaguang Zhang a,∗, Chaomin Luo c, Rui Yu a

a College of Information Science and Engineering, Northeastern University, Shenyang, 110819, China
b College of Information and Control Engineering, Shenyang Jianzhu University, Shenyang, Liaoning 100168, China
c Department of Electrical and Computer Engineering, University of Detroit Mercy, Michigan, USA

Article info
Article history: Received 16 February 2017; Revised 13 November 2017; Accepted 18 November 2018; Available online xxx
Communicated by Hak Keung Lam
Keywords: Adaptive dynamic programming (ADP); Input disturbances; Multi-player systems; Nonzero-sum games; Nash equilibrium; Neural network

Abstract
In this paper, a novel robust control strategy based on the adaptive dynamic programming (ADP) technique is proposed for multi-player nonlinear systems with input disturbances. A pair of robust control policies is constructed by multiplying the Nash solution of the nominal nonlinear nonzero-sum game by appropriate coupling gains, where the predefined cost functions account for the uncertain system disturbances. Sufficient conditions for the existence of the robust control strategy are derived, and it is proved that the robust control strategy guarantees that the multi-player nonlinear system is stable in the sense of uniform ultimate boundedness (UUB) with disturbance rejection. The single-network ADP algorithm is employed to solve the coupled Hamilton–Jacobi equations, which requires tuning online only the weights of the critic neural network (NN) of each player.
By utilizing Lyapunov theory, the NN weight estimation errors are proved to be uniformly ultimately bounded, and the stability of the closed-loop nonzero-sum game system is also guaranteed. Two numerical experiments are given to demonstrate the effectiveness of the proposed approach.

© 2018 Elsevier B.V. All rights reserved.

∗ Corresponding author. E-mail addresses: quqiuxia2010@163.com (Q. Qu), hgzhang@ieee.org (H. Zhang), chaominluo@yahoo.com (C. Luo), yuruineu@126.com (R. Yu).

1. Introduction

Multi-player systems have recently gained increasing attention due to their extensive real-world applications in both science and engineering, such as microgrid systems [1], autonomous guided vehicles [2], pursuit–evasion games [3], as well as missile guidance, military strategy, aircraft control and aerial tactics [4]. In a multi-input system, each input is computed by a player, and each player tries to influence the system state so as to minimize its own cost function. In this case, the optimization problem for each player is coupled with the optimization problems of the other players. Differential game theory provides solution concepts for many multi-player, multi-objective optimization problems [5]. Nonzero-sum games are an important part of game theory, and their objective is to find a pair of optimal policies called the Nash solution.

Many researchers have studied nonzero-sum games using various analysis methods, including the game theory introduced in [6] and Lyapunov functions. As one class of reinforcement learning techniques, ADP methods have been widely studied for the forward-in-time solution of the Hamilton–Jacobi–Bellman (HJB) equation or the coupled Hamilton–Jacobi (HJ) equations arising in optimal control problems for stochastic systems [7,8], differential games [9–14], as well as other fields [15–19]. A large number of results for practical applications have also been obtained [20–22].
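As a point of reference for the game-theoretic terms used above, the nonzero-sum game, its Nash solution, and the coupled HJ equations can be written in a commonly used form. The following is only a sketch in standard notation, not this paper's own formulation: the symbols f, g_j, Q_i, R_ij and V_i, and the quadratic-in-control cost, are assumptions introduced here for illustration, and input disturbances are omitted.

```latex
% Assumed nominal dynamics with N players (illustrative notation)
\dot{x} = f(x) + \sum_{j=1}^{N} g_j(x)\,u_j

% Assumed cost of player i, with Q_i(x) \ge 0 and R_{ii} symmetric positive definite
J_i(x_0, u_1, \dots, u_N) = \int_0^{\infty}
  \Big( Q_i(x) + \sum_{j=1}^{N} u_j^{\top} R_{ij}\, u_j \Big)\, dt

% Nash solution: no player can lower its own cost by deviating unilaterally
J_i(u_1^{*}, \dots, u_i^{*}, \dots, u_N^{*})
  \le J_i(u_1^{*}, \dots, u_i, \dots, u_N^{*})
  \quad \text{for all admissible } u_i,\ i = 1, \dots, N

% Coupled Hamilton--Jacobi equations for the value functions V_i, with V_i(0) = 0
0 = Q_i(x) + \sum_{j=1}^{N} u_j^{*\top} R_{ij}\, u_j^{*}
  + \nabla V_i^{\top} \Big( f(x) + \sum_{j=1}^{N} g_j(x)\, u_j^{*} \Big)

% Stationarity of the i-th Hamiltonian in u_i gives the feedback policy
u_i^{*}(x) = -\tfrac{1}{2}\, R_{ii}^{-1}\, g_i^{\top}(x)\, \nabla V_i(x)
```

The equations are coupled because each player's equation contains the policies of all other players, which is why they are typically solved approximately rather than in closed form.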
The ADP methods can be categorized into two classes, i.e., policy iteration (PI) and value iteration (VI) algorithms [23–29,32,33]. Recently, both PI and VI algorithms that use only input-output data were provided in [30] to obtain an optimal controller for unknown discrete-time linear systems, and these methods were extended to develop a linear tracking controller in [31]. It is worth mentioning that many researchers have studied the robust adaptive control problem for linear and nonlinear systems subject to external disturbances via ADP methods [34–38]. In addition to the aforementioned works, inspired by Adhyaru et al. [39], Wang et al. designed a robust controller through multiplication by a proper gain, and proved that it not only makes the system with uncertain disturbances uniformly ultimately bounded but is also optimal [40].

Recently, several adaptive control algorithms have been applied to solve nonzero-sum games for linear and nonlinear systems and to obtain the feedback Nash equilibrium. In [41], an online PI learning algorithm was proposed for nonlinear systems, which updates the critic-actor structure for each player synchronously and simultaneously. Zhang et al. presented an ADP algorithm for two-player nonlinear nonzero-sum games with a single network

https://doi.org/10.1016/j.neucom.2018.11.054
0925-2312/© 2018 Elsevier B.V. All rights reserved.