Neural sensorimotor primitives for vision-controlled flying robots

Oswald Berthold and Verena V. Hafner

I. Abstract

Vision-based control of aerial robots adds the challenge of visual sensing on top of the basic control problems. Here we pursue the approach of using sensor-coupled motion primitives in order to tie together both the sensor and motor components of flying robots to leverage sensorimotor interaction in a fundamental manner. We also consider the temporal scale over which autonomy is to be achieved. Longer-term autonomy requires significant adaptive capacity of the respective control structures. We investigate linear combinations of nonlinear network activity for the generation of motor output and, ultimately, behaviour. In particular, we are interested in how such networks can be set up to autonomously learn a desired behaviour. For that, we are looking at bootstrapping of motor primitives. The contribution of this work is a neurally inspired architecture for the representation of sensorimotor primitives capable of online learning on a flying robot.

II. Introduction

Our aim is the design of control circuits that allow multirotor-type helicopters to move and navigate safely and robustly in unmodified everyday environments. This requires a highly adaptive architecture, which can be realized by using Reinforcement Learning (RL) techniques in combination with autonomously acquired internal models of the robot to be controlled. The main sensory mode is vision, aided by several auxiliary channels; the overall dynamics of the underlying system, however, are largely unaffected by the choice of sensors. This course of action is motivated both by technical requirements and biological models.

The general setting is to start the robot learning episode with a default parameterization of the primitive, or policy. The policy is stochastic and the system follows the gradient of a cost function with respect to the policy parameters.
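As a rough illustration of this setting, the following sketch runs a stochastic policy on a toy one-dimensional position task and descends a likelihood-ratio estimate of the cost gradient. It is not the architecture proposed in this work: the dynamics, the tanh feature set, the quadratic cost, the moving-average baseline, the step clipping, and all names (`features`, `run_episode`, `learn`) are assumptions chosen only to make the idea concrete. The policy mean is a linear combination of fixed nonlinear features, and learning starts from a default all-zero parameterization.

```python
import math
import random


def features(x, centers):
    """Fixed nonlinear features of the position error x (hypothetical choice)."""
    return [math.tanh(x - c) for c in centers]


def run_episode(w, centers, rng, sigma=0.3, steps=50, dt=0.1):
    """Roll out the stochastic policy once on a toy 1-D position task.

    Returns the episode cost and the summed score function
    sum_t d/dw log pi(u_t | x_t). For a Gaussian policy with a linear
    mean this is (u_t - mean_t) / sigma**2 * phi(x_t).
    """
    x = 1.0  # initial position error (assumed)
    cost = 0.0
    score = [0.0] * len(w)
    for _ in range(steps):
        phi = features(x, centers)
        mean = sum(wi * p for wi, p in zip(w, phi))
        u = mean + sigma * rng.gauss(0.0, 1.0)  # exploration noise
        for i, p in enumerate(phi):
            score[i] += (u - mean) / sigma**2 * p
        # Toy dynamics (assumed): velocity command plus a small perturbation.
        x = x + dt * u + 0.01 * rng.gauss(0.0, 1.0)
        cost += x * x * dt  # quadratic cost on the position error
    return cost, score


def learn(n_episodes=200, lr=0.01, max_step=0.05, seed=0):
    """Descend a likelihood-ratio estimate of the cost gradient."""
    rng = random.Random(seed)
    centers = [-1.0, -0.5, 0.0, 0.5, 1.0]
    w = [0.0] * len(centers)  # default parameterization of the primitive
    baseline = None
    costs = []
    for _ in range(n_episodes):
        cost, score = run_episode(w, centers, rng)
        costs.append(cost)
        if baseline is None:
            baseline = cost
        advantage = cost - baseline
        for i in range(len(w)):
            step = lr * advantage * score[i]
            # Clip each update so that one noisy episode cannot change the
            # policy drastically (a crude stand-in for safe learning).
            step = max(-max_step, min(max_step, step))
            w[i] -= step  # move against the estimated cost gradient
        baseline = 0.9 * baseline + 0.1 * cost  # moving-average baseline
    return w, costs


if __name__ == "__main__":
    w, costs = learn()
    print("first episode cost:", costs[0])
    print("last episode cost:", costs[-1])
    print("weights:", w)
```

Subtracting a baseline from the episode cost is a common way to reduce the variance of such gradient estimates. The per-step clipping only limits how fast the policy can change; it does not by itself guarantee that a real robot stays safe during learning.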
The major challenge is to do this without the robot damaging or destroying itself.

The control problems that we consider here are basic stabilization of positions in space in the presence of continuous perturbation and noise. We consider the question: Can we bootstrap a controller based purely on robotic self-exploration, under the constraint of not destroying the robot during learning?

III. Related work

A. Biology

The biological idea of primitives has a long history. Motor primitives date back at least to Helmholtz and still receive active interest. In a biological context, motion primitives are mostly referred to as Central Pattern Generators (CPGs), emphasizing their autonomous nature. The decomposition of sensing into the activity of specialist primitives can be traced back at least to the ecological approach of Gibson.

In investigations of motor control, several authors describe a coherent picture of the features of biological control systems (Arbib 1981; Mussa-Ivaldi and Solla 2004; Grillner 2006; Ijspeert 2008). In summary, these features are: their distributed character, which results in intelligent behaviour from the ground up through component autonomy and goal-awareness at the lowest levels due to motor-referenced sensory primitives; their subsumption of concepts such as regulators, feedback controllers, and homeostasis, and their mediating function in output transformation; the presence of internal models coupled with identification mechanisms to realize adaptive control; and the ability to realize minute yet significant adjustments, which are in many cases necessary for successful actions. Overall, adaptability appears to be favored over precision.

B. Proposed architectures for representing primitives

Recently proposed computational models for primitive representation can be classified according to their location on the model-based to model-free spectrum. On the model-based end, Lupashin et al.
(2010); Schoellig, Mueller, and D’Andrea (2012); Mellinger, Michael, and Kumar (2012) use a detailed quadrotor flight dynamics model derived from first principles, which is then refined iteratively. Behavioural targets are trajectory following with varying degrees of freedom (DoF) and aerobatics. Faust et al. (2013) learn swing-free trajectories on load-carrying quadrotors