Semi-parametric Approaches to Learning in Model-Based Hierarchical Control of Complex Systems

Munzir Zafar, Areeb Mehmood, Mouhyemen Khan, Shimin Zhang, Muhammad Murtaza, Victor Aladele, Evangelos A. Theodorou, Seth Hutchinson, and Byron Boots

Georgia Institute of Technology, Atlanta GA 30332, USA

Abstract. For systems with complex and unstable dynamics, such as humanoids, model-based control within a hierarchical framework remains the tool of choice. This is due to the challenges of applying model-free reinforcement learning to such problems, such as sample inefficiency and limits on exploration of the state space in the absence of safety/stability guarantees. However, relying purely on physics-based models comes with its own set of problems. For instance, committing to fixed basis functions necessarily limits the expressiveness of such models and, consequently, their ability to learn from data gathered online. This gap between theoretical models and real-world dynamics gives rise to a need to incorporate a learning component at some level within the model-based control framework. In this work, we present a highly redundant wheeled inverted-pendulum humanoid as a testbed for experimental validation of some recent approaches proposed to deal with these fundamental issues in the field of robotics: (1) semi-parametric Gaussian process-based approaches to computed-torque control of serial robots [1]; (2) a probabilistic differential dynamic programming framework for trajectory planning by high-level controllers [2, 3]; and (3) barrier-certificate-based safe-learning approaches for data collection to learn the dynamics of inherently unstable systems [4]. We discuss how a typical model-based hierarchical control framework can be extended, based on the aforementioned tools, to incorporate learning at various stages of control design and at various levels of the hierarchy.
Keywords: Hierarchical Control, Model-based Control, Wheeled Inverted Pendulum Humanoids, Semi-Parametric Model, Safe learning, Probabilistic Trajectory Optimization

1 Introduction

Wheeled inverted pendulum (WIP) systems offer fast and efficient locomotion along with the ability to deal with very heavy payloads. This ability allows them to compensate for large external forces by readily adjusting their center of mass (CoM). Golem Krang [6] is a tree-structured serial robot with two serial