A Methodological Framework for Robotic Reproduction of Observed Human Actions: Formulation using Latent Space Representation*

Maria Koskinopoulou and Panos Trahanias

Abstract— The current work presents a comprehensive methodological framework that enables robots to acquire human-like behavioral acts by observing human demonstrators. The framework is established as a Learning from Demonstration (LfD) process that enables the reproduction of both learned and novel actions. Mapping of human actions to the respective robotic ones is achieved via an intermediate depiction, termed latent space representation. The latter provides a compact, yet precise abstraction of action trajectories, effectively representing high dimensional raw actions in a low dimensional space. Extensive experimentation with a real robotic arm demonstrates the robustness and applicability of the introduced framework.

I. INTRODUCTION

Humanoids can acquire human-like behaviors via a variety of learning processes. Such processes imply the development of a strategy, termed policy, that describes the mapping of a human behavior to a robotic one. Machine learning techniques have been proposed to cope with the inherent complexity of the policy formulation. A distinct approach to policy learning is the so-called Learning from Demonstration (LfD), also referred to in the literature as Imitation Learning or Programming by Demonstration (PbD) [1].

In this paper we introduce IMFO (IMitation Framework by Observation) as a novel LfD methodological framework that enables robots to reproduce human actions, based on the coupling of perception and action, which is at the core of imitation learning. IMFO can cope with the reproduction of learned (i.e. previously observed) actions, as well as novel ones.
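As an illustration of the latent-space idea outlined above, the following is a minimal sketch and not the IMFO implementation. It assumes a linear, one-dimensional latent space (the first principal direction of the data) for each actor and a least-squares line between the two latent coordinates; this section does not specify the actual dimensionality-reduction or mapping technique, so these choices, all function names, and the toy data are hypothetical.

```python
import math

def fit_latent(points, iters=100):
    """Fit a 1-D linear latent space: data mean plus first principal direction.
    (Assumed stand-in for the latent representation; not the paper's method.)"""
    n, dim = len(points), len(points[0])
    mu = [sum(p[i] for p in points) / n for i in range(dim)]
    centered = [[x - m for x, m in zip(p, mu)] for p in points]
    # Scatter matrix; its dominant eigenvector is the principal direction.
    cov = [[sum(c[i] * c[j] for c in centered) for j in range(dim)]
           for i in range(dim)]
    w = [1.0] * dim  # power-iteration start vector
    for _ in range(iters):
        w = [sum(cov[i][j] * w[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:
            raise ValueError("start vector is orthogonal to the data direction")
        w = [v / norm for v in w]
    return mu, w

def encode(x, model):
    """Project a raw configuration onto the latent coordinate."""
    mu, w = model
    return sum((xi - mi) * wi for xi, mi, wi in zip(x, mu, w))

def decode(z, model):
    """Map a latent coordinate back to the raw configuration space."""
    mu, w = model
    return [mi + z * wi for mi, wi in zip(mu, w)]

def fit_latent_map(z_src, z_dst):
    """Least-squares line z_dst ~ a * z_src + b between two latent spaces."""
    n = len(z_src)
    ms, md = sum(z_src) / n, sum(z_dst) / n
    var = sum((s - ms) ** 2 for s in z_src)
    a = sum((s - ms) * (d - md) for s, d in zip(z_src, z_dst)) / var
    return a, md - a * ms

def reproduce(x_human, human_model, robot_model, latent_map):
    """Human raw space -> human latent -> robot latent -> robot raw space."""
    a, b = latent_map
    return decode(a * encode(x_human, human_model) + b, robot_model)

# Toy paired demonstrations (hypothetical data): 3-D human, 2-D robot.
ts = [0.0, 0.25, 0.75, 1.0]
human_demo = [[t, 2.0 * t, 0.5 * t + 1.0] for t in ts]
robot_demo = [[2.0 * t + 0.1, -t + 0.3] for t in ts]

human_model = fit_latent(human_demo)
robot_model = fit_latent(robot_demo)
latent_map = fit_latent_map([encode(p, human_model) for p in human_demo],
                            [encode(p, robot_model) for p in robot_demo])

# A configuration not among the demonstrations (t = 0.6).
robot_pose = reproduce([0.6, 1.2, 1.3], human_model, robot_model, latent_map)
print(robot_pose)
```

In this toy setting the mapping is fitted purely from paired demonstrations, so no kinematic model of either actor is needed; a nonlinear latent model would be required for realistic trajectories.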
By modelling the reciprocal interaction of perception (actor’s world) and action (robot’s world), the proposed framework maps the observed actor’s space to the robot’s one by formulating an intermediate, latent space representation. Accordingly, IMFO endows robotic systems with human-like action capabilities.

At first, an initial observation phase is formulated whereby, by means of kinesthetic teaching, a set of demonstrated motion acts is learned. In the learning process, respective human and robot actions are represented in the corresponding latent spaces and a mapping (association) across the latent spaces is established. In turn, a novel human action gives rise to a representation in the human latent space, which via the learned mapping is transformed to the robot’s one. The latter is inversely mapped to the robot’s action space, effectively reproducing the observed behavior.

The rest of this paper is organized as follows. Section II presents a brief literature overview. The proposed methodological framework is introduced in Section III. Experimental results from the application of IMFO in real scenarios are presented in Section IV. The paper concludes with a summary and ideas for future work in Section V.

*This work has been partially supported by the EU FET Proactive grant (GA: 641100) TIMESTORM - Mind and Time: Investigation of the Temporal Traits of Human-Machine Convergence. The authors are with the Institute of Computer Science, Foundation for Research and Technology - Hellas (FORTH) and the Department of Computer Science, University of Crete, Heraklion, Crete, Greece {mkosk,trahania}@ics.forth.gr

II. RELATED WORK

Recent LfD works have either focused on the perceptual side of imitation by investigating robotic systems with low complexity (e.g.
mobile robots, pick-and-place industrial robots) [2], [3], or on the motor end by assuming the existence of all necessary perceptual information and using a set of basic learned motion primitives [4]. LfD methods were originally promoted in order to achieve straightforward and accurate learning of robotic tasks, in contrast to tedious reinforcement learning or trial-and-error methods [5], [6]. In addition, robust and stable LfD facilitates the application of robots in everyday environments, as witnessed by the set of human-robot interaction (HRI) tasks that have recently been included in the LfD literature [7], [8].

Simple manipulation tasks or advanced robot motions have been studied by combining and adapting various learning techniques and motion models [9]. Time-dependent dynamical systems [10], [11], autonomous dynamical systems [12], [13], polynomial and spline-based methods [14], and nonlinear regression techniques are indicative examples of methods that have been used in the literature to tackle this problem. These methods have been successfully developed to learn motion primitives, such as discrete (point-to-point) motions and their extensions to obstacle avoidance, rhythmic and hitting motions, etc. [15].

In the current work we develop an LfD approach that relies heavily on a latent space representation of the actors’ configurations. Our formulation, termed IMFO, is capable of reproducing learned as well as novel (not previously mastered) actions. A further advantage over state-of-the-art methods for imitation learning is that the mapping between the observed and the action spaces is established independently of the teacher’s and the robot’s kinematics. In addition, LfD methods are usually faced with the correspondence problem across the two (teacher-learner) spaces [16]. Handling this problem is greatly facilitated in our approach by the intermediate, latent space representation. The formulation