Acoustically Aided HMI for ROV Navigation ⋆

A. Vasilijevic *, E. Omerdic **, B. Borovic *, Z. Vukic *

* University of Zagreb, Faculty of Electrical Engineering and Computing, Laboratory for Underwater Systems and Technologies, Unska 3, HR-10000 Zagreb, Croatia (e-mail: antonio.vasilijevic@fer.hr, bruno.borovic@fer.hr, zoran.vukic@fer.hr).
** University of Limerick, ECE Department, Mobile & Marine Robotics Research Centre, Limerick, Ireland

Abstract: Most ROVs are underwater vehicles with relatively slow dynamics, which in principle leaves the pilot extra time to perform other tasks, such as inspection and arm operation. However, with many tasks performed in parallel with flying, the relevant information is typically dispersed across a number of screens, overloading the pilot's visual channel, and mishaps are therefore likely to occur. To improve the pilot's perception of the vehicle's surroundings and to unload the visual channel, a new method called ADAR (Audio Augmented Reality), which combines the concept of Augmented Reality with a virtual audio-video user interface, is proposed. In this paper, the feasibility of the proposed method as an aiding tool for navigation is investigated. All experiments are performed on a state-of-the-art, realistic ROV simulator developed by the Mobile & Marine Robotics Research Centre, ECE Department, University of Limerick, Ireland.

Keywords: Remotely Operated Vehicle (ROV), Human Machine Interface (HMI), Path Following, Acoustically Aided Guidance, Audio Display, Augmented Reality, Head Related Transfer Function (HRTF), Virtual Target - Rabbit

1. INTRODUCTION

ROV accidents are mostly caused by human factors. "There's no tactile feedback, no depth perception, no audio feedback of what's going on down there... There has been surprisingly little automated features and surprisingly minimal attention paid to human factors", Bleicher (2010). The only truly automatic controls present on modern ROVs are auto heading, auto depth and auto altitude.
Good piloting thus depends heavily on the pilot's skill. On the other hand, a contemporary ROV control room is crowded with screens presenting everything from video streams from multiple cameras to various data acquired from multiple ROV subsystems. The information is presented exclusively through the visual channel, which can deliver complex information content at a high rate, relying on humans' high visual capability and their ability to associate symbols with concepts and consequent actions. The pilot is often required to perform multiple tasks simultaneously, e.g. piloting, inspection or search, and the enormous quantity of information may therefore easily overload the pilot's visual channel and prevent them from perceiving all the information relevant to the particular task, especially if it is complex. In ROV applications these issues, i.e. dispersion of relevant information, overloading of the visual channel and operator multitasking, are recognized as a significant problem, often resulting in failed missions or even mishaps.

We thus propose a novel Augmented Reality (AR) based approach that expands the standard video user interface with a 3D audio display. The aim of the proposed method is to support ROV operations, improve the operator's perception of the underwater environment and, by using spatial sound cues, to reduce "eyes-off-the-road" time for operators. Unlike many other known AR systems, our approach does not overlay the real scene with graphics but with virtual 3D sound cues, Bellotti et al. (2002). We realize the approach by combining two different concepts: AR and an audio representation of the feedback signal.

⋆ This research was made possible by the EU 2008-1-REGPOT grant for the "Developing the Croatian Underwater Robotics Research Potential" project, grant agreement no. 229553, and by the non-governmental organization Center for Underwater Systems and Technologies (CUST).
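The core of such a 3D audio cue is the direction of a geo-referenced target, such as the virtual target ("rabbit"), expressed in the pilot's head frame. The sketch below is a hypothetical illustration, not the paper's implementation: it computes the target azimuth relative to the ROV heading in a NED-style frame and derives a constant-power stereo pan. A real system would instead filter the cue through measured HRTFs to obtain full 3D localization, including elevation and front/back cues.

```python
import math

def spatial_cue(rov_pos, rov_heading_rad, target_pos):
    """Azimuth of a target relative to the ROV heading, plus a simple
    left/right gain pair (constant-power pan).

    Frame convention (assumed): x = north, y = east, heading measured
    clockwise from north, so positive azimuth means the target is to
    starboard (pilot's right). A linear pan cannot distinguish front
    from back; HRTF filtering would be needed for that.
    """
    dn = target_pos[0] - rov_pos[0]           # north offset to target
    de = target_pos[1] - rov_pos[1]           # east offset to target
    bearing = math.atan2(de, dn)              # world-frame bearing
    # Wrap relative bearing into (-pi, pi]
    azimuth = math.atan2(math.sin(bearing - rov_heading_rad),
                         math.cos(bearing - rov_heading_rad))
    # Map -pi/2 .. +pi/2 onto full-left .. full-right, clamping the rest
    pan = max(-1.0, min(1.0, azimuth / (math.pi / 2)))
    left = math.cos((pan + 1.0) * math.pi / 4.0)
    right = math.sin((pan + 1.0) * math.pi / 4.0)
    return azimuth, left, right
```

For a target dead ahead the two gains are equal; for a target abeam to starboard the cue is panned fully right. The constant-power pan keeps perceived loudness roughly constant as the cue sweeps across the stereo field.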
AR expands, and hence improves, our perception of reality with otherwise hidden, previously unknown or unvisualised geo-referenced data in real time. In the case of the most common "visual AR" concept, reality becomes augmented with data supplied from a variety of sources such as sensors, systems, cameras and sonars, with the content presented as a new information layer or simply spread around the point of interest on the screen. All the relevant information that exists suddenly becomes part of our decision-making process. Introducing a display that uses one of the otherwise unused human senses, most commonly touch (haptic display) or, as in our case, hearing (auditory display), both unloads the operator's visual channel and, at the same time, brings additional advantages specific to