This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS

Reinforcement Learning-Based Differential Evolution With Cooperative Coevolution for a Compensatory Neuro-Fuzzy Controller

Cheng-Hung Chen, Member, IEEE, and Chong-Bin Liu

Abstract— This paper presents the integration of reinforcement learning-based differential evolution (DE) with the cooperative coevolution (R-CCDE) method in a compensatory neuro-fuzzy controller (CNFC). The CNFC model employs compensatory fuzzy operations, which increase the adaptability and effectiveness of the controller. The R-CCDE method is used to determine an adequate control policy for nonlinear system problems. The population evolves through DE with cooperative coevolution to adjust the CNFC parameters, and the fitness function of the R-CCDE method incorporates a reinforcement signal to identify the controller that solves the control problem. This paper identifies the best-performing controller for nonlinear system problems. The simulation results of the proposed R-CCDE method were compared with those of various DE methods, and the performance of the proposed R-CCDE method was superior to that of the other methods.

Index Terms— Cooperative coevolution, differential evolution (DE), neuro-fuzzy controller (NFC), nonlinear system problems, reinforcement learning.

I. INTRODUCTION

In recent years, numerous studies [1]–[5] have applied intelligent control methods to solve nonlinear system control problems. Artificial neural network controllers [6], [7] and fuzzy logic controllers [8]–[10] are typically used in intelligent control, and both types of controller can solve the aforementioned problems. However, these two methods have several shortcomings.
For example, artificial neural network controllers can quickly learn from training data and feedback propagation, but the meaning of each neuron and its weight in the network cannot be easily interpreted. Fuzzy logic controllers, by contrast, are easy to interpret because they apply linguistic terms and fuzzy IF–THEN rules, but their learning capability is inferior to that of artificial neural network controllers. Therefore, several researchers [11]–[15] have proposed neuro-fuzzy controllers (NFCs) that combine the advantages of artificial neural networks and fuzzy systems: NFCs couple the low-level learning of artificial neural networks with the high-level, human-like reasoning of fuzzy systems.

Manuscript received November 4, 2015; revised May 31, 2016 and October 25, 2017; accepted November 1, 2017. This work was supported by the Ministry of Science and Technology, Taiwan, under Grant MOST 106-2221-E-150-057. (Corresponding author: Cheng-Hung Chen.) The authors are with the Department of Electrical Engineering, National Formosa University, Yunlin County 632, Taiwan (e-mail: chchen.ee@nfu.edu.tw). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNNLS.2017.2772870

The most common types of learning in NFCs are supervised learning and reinforcement learning. Supervised learning [16]–[20] is widely used in NFCs; it trains the network parameters according to training data, which play the role of a supervisor during training. In contrast to supervised learning, reinforcement learning [21]–[26] does not rely on training data to train NFCs; instead, it identifies solutions through stochastic exploration of the search space. In this paper, evolutionary algorithms (EAs) were added to reinforcement learning.
EAs [27]–[32] are heuristic and stochastic search algorithms that are often used for optimizing complex, multidimensional, and multimodal functions whose actual functional form is unknown. A new EA, called differential evolution (DE), was developed by Storn and Price in 1995 [33]. DE belongs to the broad class of EAs and possesses numerous advantages, including a strong search capability and fast convergence in real-valued problems [33]–[36]. In recent years, the DE algorithm has gradually become the most common method in numerous practical applications, and several studies [37]–[40] have confirmed that DE is robust and effective.

This paper proposes a reinforcement learning-based DE with cooperative coevolution (R-CCDE) approach to adjust the parameters of a compensatory NFC (CNFC) and thereby obtain effective performance in solving nonlinear system problems. The CNFC is based on our previous research [41] and uses adaptive compensatory fuzzy reasoning to dynamically adjust the fuzzy operators. The R-CCDE method integrates a population space, a belief space [42], and cooperative coevolution [43]–[45] into DE, in which the fitness function incorporates a reinforcement signal, increasing the performance and search capability during the learning phase. The proposed methods were applied to various nonlinear system control problems, and the simulation results demonstrated their effectiveness.

The remainder of this paper is organized as follows. Section II describes the compensatory fuzzy operation. Section III describes the structure of the CNFC. Section IV presents the proposed CCDE method, and Section V presents the proposed R-CCDE method. Section VI provides the simulation results of three nonlinear control problems. The conclusion is provided in Section VII.

II. COMPENSATORY OPERATION

Zhang and Kandel [46] proposed compensatory operations based on the pessimistic operation and the optimistic operation.
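As a rough illustration of the idea, a compensatory operation blends a pessimistic aggregation (a t-norm) with an optimistic one (an s-norm) through a compensatory degree gamma. The sketch below uses min and max as the two operations purely for illustration; the specific operators and the geometric blending form are assumptions, not necessarily the instantiation used in this paper.

```python
def compensatory(a, b, gamma=0.5):
    """Blend a pessimistic t-norm (min) with an optimistic s-norm (max).

    gamma in [0, 1] is the compensatory degree: gamma=0 recovers the
    pessimistic operation, gamma=1 the optimistic one.  The geometric
    blend (p ** (1-gamma)) * (o ** gamma) is one common instantiation
    (an illustrative assumption here, not necessarily the paper's form).
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    pessimistic = min(a, b)  # cautious aggregation of membership degrees
    optimistic = max(a, b)   # generous aggregation of membership degrees
    return pessimistic ** (1.0 - gamma) * optimistic ** gamma
```

With gamma = 0 the result equals min(a, b), with gamma = 1 it equals max(a, b), and intermediate values interpolate between the two, which is what lets a CNFC tune how optimistic its fuzzy reasoning is.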
Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
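For readers unfamiliar with the base optimizer discussed in the Introduction, the canonical DE/rand/1/bin scheme of Storn and Price can be sketched as a minimal routine. The sphere objective, hyperparameter values, and function names below are illustrative assumptions, not the settings or implementation used in this paper.

```python
import numpy as np

def de_rand_1_bin(fitness, bounds, pop_size=20, F=0.5, CR=0.9,
                  generations=200, seed=0):
    """Minimize `fitness` with the classic DE/rand/1/bin scheme (sketch)."""
    rng = np.random.default_rng(seed)
    dim = len(bounds)
    low, high = np.asarray(bounds, dtype=float).T
    pop = rng.uniform(low, high, size=(pop_size, dim))
    fit = np.array([fitness(x) for x in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from the target.
            r1, r2, r3 = rng.choice(
                [j for j in range(pop_size) if j != i], size=3, replace=False)
            # Mutation: perturb a base vector by a scaled difference vector.
            mutant = np.clip(pop[r1] + F * (pop[r2] - pop[r3]), low, high)
            # Binomial crossover; force at least one gene from the mutant.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            # Greedy selection: keep the better of target and trial.
            f_trial = fitness(trial)
            if f_trial <= fit[i]:
                pop[i], fit[i] = trial, f_trial
    best = int(np.argmin(fit))
    return pop[best], fit[best]

# Usage: minimize the 5-D sphere function as a toy real-valued problem.
best_x, best_f = de_rand_1_bin(lambda x: float(np.sum(x * x)),
                               bounds=[(-5.0, 5.0)] * 5)
```

The mutation-crossover-selection loop above is the part that R-CCDE extends with cooperative coevolution and a reinforcement-signal-based fitness function.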