Comparing Elicited Gestures to Designer-Created Gestures for Selection above a Multitouch Surface

Dmitry Pyryeskin, Mark Hancock, Jesse Hoey
University of Waterloo, Ontario, Canada
{dpyryesk, mark.hancock, jhoey}@uwaterloo.ca

ABSTRACT
Many new technologies are emerging that make it possible to extend interaction into the three-dimensional space directly above or in front of a multitouch surface. Such techniques allow people to control these devices by performing hand gestures in the air. In this paper, we present a method of extending interactions into the space above a multitouch surface using only a standard diffused surface illumination (DSI) device, without any additional sensors. We then focus on interaction techniques for activating graphical widgets located in this above-surface space. We conducted a study to elicit gestures for above-table widget activation, and a follow-up study to evaluate and compare these gestures based on their performance. Our results showed that there was no clear agreement on what gestures should be used to select objects in mid-air, and that performance was better with gestures that were chosen less frequently but predicted by the designers to perform better, as opposed to those most frequently suggested by participants.

Author Keywords
Multimodal interaction; natural human computer interaction; surface computing; multi-touch; gestures; hoverspace.

ACM Classification Keywords
H.5.2 [Information interfaces and presentation]: User Interfaces - Graphical user interfaces.

General Terms
Human Factors; Design; Performance; Measurement; Experimentation.

INTRODUCTION
Multi-touch technology was conceived at least as early as 1965 [14], and has since slowly become more reliable, accurate and commonplace. Nowadays, devices equipped with multi-touch screens are becoming ubiquitous on phones and tablets, and are being researched heavily on larger surfaces, such as tables and walls.
This shift provides the potential for direct interaction with on-screen objects in a fashion familiar from the physical world [1,10,26]. Recent technology, such as the Microsoft Kinect, has made it cheaper to extend this physical interaction into hover space—the space above or in front of a multi-touch display. Adding hover-space input to touch input can provide another mode of interaction, while allowing smooth transitions from one mode to another [17]. This added dimension in the interaction space can be used for a variety of purposes, for instance to manipulate 3D artifacts [13], to provide shortcuts to applications via Hover Widgets [8], or to create occlusion-aware interfaces [24].

While this design space is promising, one of the most compelling aspects of direct touch interaction is the clear and understandable way in which on-screen targets can be selected—by touching them with your hands or fingers. This physicality, however, is lost in hover space, and it is no longer clear how digital artifacts can and should be selected. Will people expect to be able to grab objects in mid-air, point at objects from a distance, or will they understand the need to dwell over a 3D target to select it (for example)? Currently, little work has explored what gestures people expect to be able to use to select targets above a table. In this work we study target selection in this space, with respect to both people's expectations and performance.

In this paper, we explore interaction in hover space by focusing specifically on item selection in the space above a multi-touch surface. We first present the design of a system that can approximate the height of hands above a diffused surface illumination (DSI) device.
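One plausible way such height approximation could work on a DSI table (a hypothetical sketch, not necessarily the implementation described in this paper) is to exploit the fact that a hand's infrared reflection grows dimmer as it rises away from the diffuser, and to interpolate the mean brightness of a tracked blob against a small per-setup calibration table. The function name and all calibration values below are illustrative assumptions:

```python
from statistics import mean

def estimate_height(blob_pixels, calib):
    """Rough height estimate (mm) for a hand above a DSI surface.

    Assumption: in diffused surface illumination, a hand's IR
    reflection dims as it rises off the table, so the mean brightness
    of a tracked blob is a usable height proxy. `calib` maps measured
    brightness values to known heights for this particular setup.
    """
    b = mean(blob_pixels)
    pts = sorted(calib.items())       # calibration points, ascending brightness
    if b <= pts[0][0]:
        return pts[0][1]              # dimmer than any sample: max height
    if b >= pts[-1][0]:
        return pts[-1][1]             # brighter than any sample: on the surface
    for (b0, h0), (b1, h1) in zip(pts, pts[1:]):
        if b0 <= b <= b1:
            t = (b - b0) / (b1 - b0)  # piecewise-linear interpolation
            return h0 + t * (h1 - h0)
```

In practice, the calibration table would be measured once per device by holding a hand at known heights above the surface, and blob segmentation itself would come from standard thresholding of the IR camera image.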
We then present the results of a pair of studies: in the first, we elicit what gestures people expect to be able to use to select on-screen targets in hover space, and in the second, we explore the performance of the gestures chosen from the first study compared to several of our own designs for selection. Some of the gestures identified in our first study were beyond the capability of our hardware system, though they might be possible with additional hardware (e.g., a separate motion-tracking system). Thus, the focus of our second study was on evaluating the performance of gestures that were practical to implement with minimal hardware. Our results show not only that people disagree about how to select objects in this space, but also that the less frequently chosen designs that we (the designers) predicted to perform better did, in most cases, outperform the most frequently chosen gestures from the first study.

RELATED WORK
In this section we focus on four related areas: detecting movement above a surface, interaction in front of a surface, how others study gestures, and in-air target selection.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ITS'12, November 11–14, 2012, Cambridge, Massachusetts, USA.
Copyright 2012 ACM 978-1-4503-1209-7/12/11...$15.00.