Gait Optimization for Roombots Modular Robots - Matching Simulation and Reality

Rico Moeckel, Yura N. Perov, Anh The Nguyen, Massimo Vespignani, Stéphane Bonardi, Soha Pouya, Alexander Sproewitz, Jesse van den Kieboom, Frédéric Wilhelm, Auke Jan Ijspeert

Abstract— The design of efficient locomotion gaits for robots with many degrees of freedom is challenging and time-consuming even when optimization techniques are applied. Control parameters can be found through optimization in two ways: (i) through online optimization, where the performance of a robot is measured while trying different control parameters on the actual hardware, and (ii) through offline optimization, by simulating the robot's behavior with the help of models of the robot and its environment. In this paper, we present a hybrid optimization method that combines the best properties of online and offline optimization to efficiently find locomotion gaits for arbitrary structures. In comparison to pure online optimization, both the number of hardware experiments and the total time required to find efficient locomotion gaits are greatly reduced by running the major part of the optimization process in simulation on a cluster of processors. The presented example shows that, even for robots with a low number of degrees of freedom, the time required for optimization can be reduced by a factor of at least 2.5 to 30, depending on how extensive the search for optimized control parameters should be. Time for hardware experiments becomes minimal. More importantly, gaits that could damage the robotic hardware can be filtered out before they are tried on hardware. Yet, in contrast to pure offline optimization, we achieve well-matched behavior that allows a direct transfer of locomotion gaits from simulation to hardware.
This is because, through a meta-optimization, we adapt not only the locomotion parameters but also the parameters of the simulation models of the robot and environment, which yields a good match between the robot behavior in simulation and on hardware. We validate the proposed hybrid optimization method on a structure composed of two Roombots modules with a total of six degrees of freedom. Roombots are self-reconfigurable modular robots that can form arbitrary structures with many degrees of freedom through an integrated active connection mechanism.

All authors are with the Biorobotics Laboratory, Ecole Polytechnique Fédérale de Lausanne, Switzerland. Yura Perov is also with the Siberian Federal University, Institute of Mathematics and Computer Science, Krasnoyarsk, Russia. Corresponding authors: {rico.moeckel, auke.ijspeert}@epfl.ch

I. INTRODUCTION

With an increasing number of degrees of freedom, it becomes challenging and often even impossible to design and tune efficient locomotion controllers by hand. Scalable controllers like Central Pattern Generators (CPGs), in combination with learning and optimization techniques, allow for an automatic exploration of efficient locomotion gaits in simulation [1] and hardware [2]. With their relatively low number of control parameters, CPGs can reduce the time required for gait optimization. However, even CPGs cannot fully solve the problems that come with optimization techniques that are based purely on hardware or software experiments.

Online optimization, where the optimization process is performed on the robotic hardware, is typically too time-consuming for robotic structures with many degrees of freedom. The parameter space exploration requires experiments running in real time, and unless many robots with well-matched behavior are available, the optimization process cannot be parallelized.
Furthermore, online optimization can be dangerous for the robotic hardware, since high impacts between robot and ground often cannot be predicted and are detected only during the actual experiment.

Offline optimization allows the exploration of a variety of control parameters in simulation, often faster than real time and in parallel, since the optimization process can be performed on a cluster with many processors. Furthermore, time-consuming processes such as resetting the robot after each experiment and charging and replacing batteries can be avoided. Control parameters can be explored safely without the risk of damaging expensive robotic hardware. This is why the exploration of robot behavior in simulation is so popular. However, offline optimization has one major drawback that can make it poorly suited for finding control parameters for robotic hardware: due to a lack of precision in the robot and environment models, the optimized control parameters are typically not transferable from simulation to robotic hardware, a problem known as the "reality gap."

Many researchers have studied pure online and offline optimization of locomotion patterns for legged and modular robots [3]–[10].

Several other researchers have started to address the problem of reducing the reality gap. Lipson et al. [11], Glette et al. [12], and Coros et al. [13] have presented studies using quadruped robots. Adams has used artificial evolution as a tool for generating controllers for physical robots [14]. Bongard et al. studied self-modeling machines [15]. A comparison of different strategies for simulator tuning was presented by Klaus et al. [16].

This paper explores hybrid optimization as a way to combine the advantages of online and offline optimization, applied to a modular robot. Hybrid optimization is a cyclic method that avoids time-consuming parameter optimization on hardware.
Instead, hybrid optimization aims at finding optimal control parameters in simulation, using simulation models that closely match the robotic hardware and the environment (Fig. 1).

2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), November 3-7, 2013, Tokyo, Japan. 978-1-4673-6357-0/13/$31.00 ©2013 IEEE

In