An Intelligent Virtual Meeting App for Seamlessly Polling Virtual Participants "On-the-Fly" with Nonverbal Communication Cues

Jung In Koh, Texas A&M University, College Station, USA, jungin@tamu.edu
Samantha Ray, Texas A&M University, College Station, USA, sjr45@tamu.edu
Josh Cherian, Texas A&M University, College Station, USA, jcherian14@tamu.edu
Paul Taele, Texas A&M University, College Station, USA, ptaele@tamu.edu
Tracy Hammond, Texas A&M University, College Station, USA, hammond@tamu.edu

ABSTRACT
The prevalence of virtual meeting software and webcams has normalized holding group meetings remotely and in distributed settings. However, virtual meetings reduce viewing to computer displays and listening to unidirectional speakers, disrupting the real-life social cues used for activities such as informally polling participants "on-the-fly". This paper proposes an intelligent virtual meeting app called Show of Hands, which leverages nonverbal communication cues to spontaneously poll virtual participants. The app recognizes virtual participants' intuitive real-time hand gestures expressing their intended polling selections, and then displays in real time highly visible video filters that overlay participants' camera views, along with a visual chart of the aggregated polling counts. Our work benefits virtual participants by letting them seamlessly conduct spontaneous polls to gauge opinions or check attendees' knowledge without prior preparation, express poll responses using familiar physical gestures, and stay better informed of poll results through both a distributed view of video filters and a focused chart visualization.

CCS CONCEPTS
• Human-centered computing → Interactive systems and tools; User studies; Mixed / augmented reality; Gestural input.

KEYWORDS
hand gestures, virtual meetings, impromptu polling, educational interfaces, gesture elicitation

ACM Reference Format:
Jung In Koh, Samantha Ray, Josh Cherian, Paul Taele, and Tracy Hammond. 2023.
An Intelligent Virtual Meeting App for Seamlessly Polling Virtual Participants "On-the-Fly" with Nonverbal Communication Cues. In 28th International Conference on Intelligent User Interfaces (IUI '23), March 27–31, 2023, Sydney, Australia. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3581754.3584141

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
IUI '23, March 27–31, 2023, Sydney, Australia
© 2023 Copyright held by the owner/author(s).
ACM ISBN 979-8-4007-0107-8/23/03.
https://doi.org/10.1145/3581754.3584141

1 INTRODUCTION
The recent demand for and interest in synchronous online classes and alternative professional meeting opportunities have driven the growing reliability and ubiquity of video conferencing technology. As a result, people can now attend meetings remotely and virtually while still performing many of the same tasks they would in a traditional in-person meeting. Impromptu polling, for example, is an interaction in which a host gauges attendees' opinions by collecting and interpreting their prompted responses; the host becomes better informed of the virtual meeting participants' state of mind, and attendees feel more engaged and empowered because their feedback is heard. However, conducting impromptu polling in virtual meetings poses challenges that are not present in traditional in-person interactions.
Unlike in-person meeting environments, where attendees are co-located such that hosts can more immediately gauge the collective opinions of their peers through visual cues (e.g., raising of hands, performing hand gestures, facial expressions) and audio cues (e.g., verbal responses, group volume levels), remote environments lack co-located presence, making such visual and audio cues more difficult to gauge through video conferencing technology [1, 9]. Current online polling websites such as Poll Everywhere [2] and Kahoot! [6] allow hosts to conduct structured polling in a more seamless and entertaining way, respectively, but such solutions are constrained to polls prepared in advance of the meetings (e.g., online quizzes, formal surveys). Furthermore, developers of popular video conferencing software such as Zoom [10], Google Meet [3], and Microsoft Teams [8] have been motivated by the need to bridge the gap between traditional in-person meetings and virtual remote meetings, introducing engaging visual communication markers (e.g., text messaging emoji, video filters) to supplement users' live-streamed body cues. Some of these tools have begun incorporating hand gesture recognition for common gestures such as raising a hand and thumbs up. However, such approaches still require hosts to manually view these digital visual cues, and the cognitive load required to comprehend this feedback increases significantly as the size of the meeting audience grows. The challenge grows further when considering that (A) attendees may choose to turn off their microphones and/or cameras at any point during the meeting, and (B) not all attendees may be visible on the host's screen at once, being spread across pages of small video windows.
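The aggregation step described in the abstract — mapping each participant's recognized hand gesture to a poll choice and tallying the results for a summary chart — can be sketched as follows. This is a minimal illustration only: the gesture labels and the `GESTURE_TO_CHOICE` mapping are assumptions for exposition, not the actual Show of Hands gesture set or implementation.

```python
from collections import Counter

# Hypothetical gesture labels a per-participant recognizer might emit.
# These are illustrative assumptions, not the paper's actual gesture set.
GESTURE_TO_CHOICE = {
    "thumbs_up": "yes",
    "thumbs_down": "no",
    "raised_hand": "abstain",
}

def tally_poll(recognized_gestures):
    """Aggregate per-participant gesture labels into poll counts.

    recognized_gestures: dict mapping participant id -> gesture label,
    or None when no gesture was detected (e.g., camera turned off).
    """
    counts = Counter()
    for participant, gesture in recognized_gestures.items():
        choice = GESTURE_TO_CHOICE.get(gesture)
        if choice is not None:  # skip unrecognized / missing gestures
            counts[choice] += 1
    return dict(counts)

# Example: three participants respond; one has no detected gesture.
print(tally_poll({
    "alice": "thumbs_up",
    "bob": "thumbs_down",
    "carol": "thumbs_up",
    "dave": None,
}))  # → {'yes': 2, 'no': 1}
```

Keeping the tally independent of the recognizer makes the design robust to the paper's stated challenges: participants with cameras off simply contribute no vote, and the aggregated chart remains meaningful even when not all attendees are visible on the host's screen.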