J Intell Robot Syst (2008) 51:3–30
DOI 10.1007/s10846-007-9174-5

Hybrid Dynamic Control Algorithm for Humanoid Robots Based on Reinforcement Learning

Duško M. Katić · Aleksandar D. Rodić · Miomir K. Vukobratović

Received: 1 September 2006 / Accepted: 31 July 2007 / Published online: 25 September 2007
© Springer Science + Business Media B.V. 2007

D. M. Katić (✉) · A. D. Rodić · M. K. Vukobratović
Robotics Department, Mihajlo Pupin Institute, Volgina 15, Belgrade 11060, Serbia
e-mail: dusko@robot.imp.bg.ac.yu
A. D. Rodić e-mail: roda@robot.imp.bg.ac.yu
M. K. Vukobratović e-mail: vuk@robot.imp.bg.ac.yu

Abstract This paper presents a hybrid integrated dynamic control algorithm for a humanoid locomotion mechanism. The proposed controller structure involves two feedback loops: a model-based dynamic controller, including an impact-force controller, and a reinforcement learning feedback controller around the zero-moment point. The proposed reinforcement learning algorithm is based on a modified version of the actor-critic architecture for dynamic reactive compensation. Simulation experiments were carried out to validate the proposed control approach. The simulation results served as the basis for a critical evaluation of the controller performance.

Keywords Humanoid robots · Biped locomotion · Integrated dynamic control · Reinforcement learning · Actor–critic method

1 Introduction

Contemporary humanoid robots are expected to be servants and maintenance machines whose main task is to assist human activities in daily life and to replace humans in hazardous operations. It is equally obvious that anthropomorphic biped robots are potentially capable of moving effectively in all unstructured environments where humans do. There are also strong expectations that robots for personal use will coexist with humans and provide support such as assistance with housework,