Neural network Reinforcement Learning for visual control of robot manipulators

Zoran Miljković a,*, Marko Mitić a, Mihailo Lazarević b, Bojan Babić a

a University of Belgrade, Faculty of Mechanical Engineering, Production Engineering Department, Kraljice Marije 16, 11120 Belgrade 35, Serbia
b University of Belgrade, Faculty of Mechanical Engineering, Department of Mechanics, Kraljice Marije 16, 11120 Belgrade 35, Serbia

Keywords: Reinforcement Learning; Neural network; Robot manipulator; Image Based Visual Servo control; Intelligent hybrid control

Abstract

It is known that most of the key problems in visual servo control of robots are related to the performance analysis of the system considering measurement and modeling errors. In this paper, the development and performance evaluation of a novel intelligent visual servo controller for a robot manipulator using neural network Reinforcement Learning is presented. By implementing machine learning techniques into the vision based control scheme, the robot is enabled to improve its performance online and to adapt to changing conditions in the environment. Two different temporal difference algorithms (Q-learning and SARSA) coupled with neural networks are developed and tested through different visual control scenarios. A database of representative learning samples is employed so as to speed up the convergence of the neural network and real-time learning of robot behavior. Moreover, the visual servoing task is divided into two steps in order to ensure the visibility of the features: in the first step, centering behavior of the robot is conducted using the neural network Reinforcement Learning controller, while the second step involves switching control between the traditional Image Based Visual Servoing and the neural network Reinforcement Learning for enabling approaching behavior of the manipulator.
The correction in robot motion is achieved with the definition of the areas of interest for the image features independently in both control steps. Various simulations are developed in order to present the robustness of the developed system regarding calibration error, modeling error, and image noise. In addition, a comparison with the traditional Image Based Visual Servoing is presented. Real world experiments on a robot manipulator with a low cost vision system demonstrate the effectiveness of the proposed approach.

© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

In recent years, a wide variety of applications regarding autonomous robot behavior in unstructured and unknown environments have been developed. Similarly to biological systems, new generations of robots are able to learn and to adapt to changing conditions in real time. This property is necessary when facing difficult tasks in practice such as search and rescue missions, reconnaissance, surveillance, and inspection in complex and dangerous surroundings. The possibility of robot vision can be crucial in these assignments since it mimics the human sense of sight and allows for noncontact measurement of the environment (Hutchinson, Hager, & Corke, 1996).

Visual servoing or visual servo control represents a known solution to control the motion of robot manipulators in structured environments. It involves various techniques from image processing, computer vision, and control theory (Chaumette & Hutchinson, 2006). By using these approaches, sophisticated autonomous systems containing low cost sensors and actuators can be developed. In visual servo control, the information from one or more cameras is used within the control loop in order to ensure the desired position of the robot as required by a task. The vision data is acquired from a camera which is placed directly onto the manipulator (eye-in-hand configuration) or in a fixed position over the scene (eye-to-hand configuration).
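For reference, the classical image-based control law from the cited survey (Chaumette & Hutchinson, 2006) drives the image-feature error to zero through the interaction matrix; the form below is the standard one from that reference, reproduced here only as background for the traditional IBVS step mentioned in the abstract:

```latex
% e = s - s^* is the image-feature error, L_s the interaction matrix,
% \widehat{L_s^{+}} an estimate of its Moore-Penrose pseudoinverse,
% lambda > 0 a control gain, and v_c the commanded camera velocity.
\[
  \dot{e} = L_s \, v_c, \qquad
  v_c = -\lambda \, \widehat{L_s^{+}} \, e
\]
```

With a perfect estimate of the interaction matrix this yields an exponential decrease of the error, which is why robustness to calibration and modeling errors in the estimate is a central concern of this paper.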
In the next step, the features on the image plane are servo controlled to their goal positions. It is well known that points are the simplest features that can be extracted from an image, both from a geometrical and an image processing point of view (Fomena, Omar, & Chaumette, 2011). Therefore, most of the applications in visual servoing are based on an image of points, such as visual homing (Basri, Rivlin, & Shimshoni, 1999), navigation and path planning of robot manipulators (Cowan & Koditschek, 1999; Mezouar & Chaumette, 2002), mobile robot navigation (Ma, Kosecka, & Sastry, 1999; Mariottini, Oriolo, & Prattichizzo, 2007; Miljković, Vuković, Mitić, & Babić, in press), and stabilization of aerial vehicles (Ceren & Altuğ, 2012; Hamel & Mahony, 2002).

One of the many challenges in visual servo control is the maintenance of visual features within the field of view of the camera. Also, robustness to camera calibration parameters and image noise is very important in real world applications. Likewise, unknown disturbances during the motion, as well as the robot motion itself, can result in the loss of visual features from the image plane.

Expert Systems with Applications 40 (2013) 1721–1736
http://dx.doi.org/10.1016/j.eswa.2012.09.010
* Corresponding author. Tel.: +381 11 3302 468; fax: +381 11 3370 364. E-mail address: zmiljkovic@mas.bg.ac.rs (Z. Miljković). URL: http://cent.mas.bg.ac.rs/english/staff/zmiljkovic.htm (Z. Miljković).
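The abstract names Q-learning and SARSA as the two temporal difference algorithms coupled with neural networks. The sketch below illustrates only the difference between their value updates in a tabular setting; the states, actions, reward, and parameter values are illustrative assumptions, not the authors' neural network implementation:

```python
# Hedged tabular sketch of the two temporal-difference updates mentioned
# in the abstract. In the paper these targets train a neural network
# approximator; here a plain dict Q stands in for it.

ALPHA = 0.1   # learning rate (assumed value)
GAMMA = 0.9   # discount factor (assumed value)

def q_learning_update(Q, s, a, r, s_next, actions):
    """Off-policy TD update: bootstrap from the greedy action in s_next."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

def sarsa_update(Q, s, a, r, s_next, a_next):
    """On-policy TD update: bootstrap from the action actually taken."""
    Q[(s, a)] += ALPHA * (r + GAMMA * Q[(s_next, a_next)] - Q[(s, a)])
```

The only difference is the bootstrap term: Q-learning uses the maximum value over next actions (off-policy), while SARSA uses the value of the action the current policy actually selects (on-policy), which is why the two can converge to different behaviors under the same exploration scheme.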