Integrating Multi-Modal Interfaces to Command UAVs [Video Abstract]

Valiallah (Mani) Monajjemi, Shokoofeh Pourmehr, Seyed Abbas Sadat, Fei Zhan, Jens Wawerla, Greg Mori and Richard Vaughan
School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
{mmonajje, spourmeh, sas21, fzhan, jwawerla, mori, vaughan}@sfu.ca

ABSTRACT

We present an integrated human-robot interaction system that enables a user to select and command a team of two Unmanned Aerial Vehicles (UAVs) using voice, touch, face engagement and hand gestures. The system integrates multiple human [multi]-robot interaction interfaces, as well as a navigation and mapping algorithm, in a coherent semi-realistic scenario. The task of the UAVs is to explore and map a simulated Mars environment.

To initiate a mission, the user needs to select a robot. To do this, we used the “Touch-To-Name” selection and naming interface [3]. In this method, the user first announces the desired number of robots (e.g., “You” or “You Two”), then gently moves the intended robot(s) iteratively. The robots compare their accelerometer readings over Wi-Fi to agree on which one is selected. Once a robot is selected, the user names it with a verbal command (e.g., “You are Green”). These names are then used to command the robots (e.g., “Green Takeoff”) [4]. Here, we use this interface with the maximum group size set to one.

After taking off, and while hovering, the robot looks for human faces in its camera feed. When the user's face is detected, the robot continuously controls its altitude and heading direction to face the user. A hand wave gesture (left or right) assigns the robot an exploration task in the indicated direction. We used the method described in [2] for face tracking and gesture recognition.

While exploring, each robot performs vision-based Simultaneous Localization and Mapping (SLAM) using its on-board monocular camera [1].
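As an illustration of the Touch-To-Name selection step described above, the following is a minimal sketch of how robots might agree on which one was moved when the maximum group size is one. It is not the method of [3]: the motion score (variance of accelerometer magnitudes), the `min_score` threshold, the robot identifiers and the function names are all assumptions made for this example. Each robot is assumed to have received every other robot's score over Wi-Fi, so all robots evaluate the same table and reach the same answer.

```python
from statistics import pvariance
from typing import Dict, List, Optional


def motion_score(accel_magnitudes: List[float]) -> float:
    """Hypothetical motion score: variance of recent accelerometer magnitudes.

    A robot resting on the ground reads roughly constant gravity, so its
    variance is near zero; a robot being nudged by the user scores higher.
    """
    return pvariance(accel_magnitudes) if accel_magnitudes else 0.0


def select_robot(scores: Dict[str, float], min_score: float = 0.05) -> Optional[str]:
    """Pick the single robot with the highest motion score.

    Every robot runs this on the same shared table of scores, so they all
    agree on the result. Ties are broken by the smallest robot id. Returns
    None if no robot moved enough (min_score is an assumed threshold).
    """
    if not scores:
        return None
    best = max(sorted(scores), key=lambda robot_id: scores[robot_id])
    return best if scores[best] >= min_score else None


# Example with made-up readings (m/s^2): robot_b is the one being moved.
readings = {
    "robot_a": [9.80, 9.80, 9.81, 9.79],
    "robot_b": [9.80, 11.20, 8.10, 10.90],
}
scores = {robot_id: motion_score(r) for robot_id, r in readings.items()}
selected = select_robot(scores)
print(selected)
```

Evaluating the same deterministic rule on shared data avoids the need for a separate negotiation round; a real system would also have to handle lost packets and time alignment of the readings, which this sketch ignores.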
We used the “feature-rich path planning” algorithm introduced in [5] to robustly navigate a UAV while it explores an unknown environment. To terminate the mission, the user commands each robot to come back home (e.g., “Green come back”). To come back, the robots use the same algorithm to plan a feature-rich path to their takeoff position. Finally, the user asks the robots to land (e.g., “Green land”).

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). Copyright is held by the author/owner(s).
HRI'14, March 3–6, 2014, Bielefeld, Germany.
ACM 978-1-4503-2658-2/14/03.
http://dx.doi.org/10.1145/2559636.2559646

The system provides two types of feedback to the user during interaction sessions and mission execution. The robots change the color and blinking pattern of their LED lights to inform the user about their state (e.g., “tracking user's face”, “exploring” or “being idle”). In addition, a text-to-speech (TTS) engine provides verbal feedback to the user whenever a robot's state changes. For example, when the user asks the Green robot to come back, it acknowledges by saying “Green is coming back”. The TTS engine is embedded within a general-purpose web-based robot monitoring dashboard.

We used the Parrot AR-Drone 2.0 quadrocopter as the UAV platform in our system. All described software components run off-board on two commodity Intel Core i7 notebooks (one dedicated to each robot). The computers are connected to the UAVs via Wi-Fi.

The video shows a complete run-through of a two-robot exploration mission in which the HRI worked perfectly.
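The two feedback channels described above (LED pattern and spoken acknowledgement on a state change) can be sketched as a small lookup keyed by robot state. This is only an illustrative sketch: the state names follow the examples in the text, but the specific colors, blink rates, the `feedback_on_transition` function and all phrases other than “Green is coming back” are assumptions, not the system's actual configuration.

```python
from enum import Enum
from typing import Optional, Tuple


class RobotState(Enum):
    IDLE = "idle"
    TRACKING_FACE = "tracking_face"
    EXPLORING = "exploring"
    COMING_BACK = "coming_back"


# Hypothetical LED patterns: (color, blink rate in Hz; 0.0 means steady).
LED_PATTERNS = {
    RobotState.IDLE: ("red", 0.0),
    RobotState.TRACKING_FACE: ("green", 2.0),
    RobotState.EXPLORING: ("orange", 1.0),
    RobotState.COMING_BACK: ("orange", 4.0),
}

# Spoken phrases; only "coming back" is taken from the text, the rest are assumed.
PHRASES = {
    RobotState.IDLE: "idle",
    RobotState.TRACKING_FACE: "tracking your face",
    RobotState.EXPLORING: "exploring",
    RobotState.COMING_BACK: "coming back",
}


def feedback_on_transition(
    name: str, old: RobotState, new: RobotState
) -> Optional[Tuple[Tuple[str, float], str]]:
    """Return (LED pattern, TTS sentence) for a state change, or None if unchanged."""
    if old == new:
        return None  # verbal feedback is given only when the state changes
    return LED_PATTERNS[new], f"{name} is {PHRASES[new]}"


led, sentence = feedback_on_transition(
    "Green", RobotState.EXPLORING, RobotState.COMING_BACK
)
print(led, sentence)
```

Keeping the LED and speech tables separate from the transition logic means either channel can be retuned without touching the other.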
Categories and Subject Descriptors

H.5.2 [User Interfaces and Presentation]: Robotics, Human Multi-Robot Interfaces, Interaction styles

1. REFERENCES

[1] G. Klein and D. Murray. Parallel tracking and mapping for small AR workspaces. In Proc. Sixth IEEE and ACM Int. Symp. on Mixed and Augmented Reality (ISMAR'07), Nara, Japan, November 2007.
[2] V. Monajjemi, J. Wawerla, R. T. Vaughan, and G. Mori. HRI in the sky: Creating and commanding teams of UAVs with a vision-mediated gestural interface. In Proc. of Int. Conf. on Intelligent Robots and Systems, 2013.
[3] S. Pourmehr, V. Monajjemi, S. A. Sadat, F. Zhan, J. Wawerla, G. Mori, and R. Vaughan. “You are Green”: a Touch-to-Name interaction in an integrated multi-modal multi-robot HRI system. In Int. Conf. on Human-Robot Interaction (HRI), 2014 (submitted).
[4] S. Pourmehr, V. Monajjemi, R. T. Vaughan, and G. Mori. You two! Take off! Creating, modifying and commanding groups of robots using face engagement and indirect speech in voice commands. In Proc. of Int. Conf. on Intelligent Robots and Systems, 2013.
[5] S. A. Sadat, K. Chutskoff, D. Jungic, J. Wawerla, and R. Vaughan. Feature-rich path planning for robust navigation of MAVs with mono-SLAM. In Int. Conf. on Robotics and Automation (ICRA), 2014 (submitted).