An Intelligent Virtual Meeting App for Seamlessly Polling Virtual
Participants “On-the-Fly” with Nonverbal Communication Cues
Jung In Koh
Texas A&M University
College Station, USA
jungin@tamu.edu
Samantha Ray
Texas A&M University
College Station, USA
sjr45@tamu.edu
Josh Cherian
Texas A&M University
College Station, USA
jcherian14@tamu.edu
Paul Taele
Texas A&M University
College Station, USA
ptaele@tamu.edu
Tracy Hammond
Texas A&M University
College Station, USA
hammond@tamu.edu
ABSTRACT
The prevalence of virtual meeting software and webcams has normalized holding group meetings remotely and in distributed settings.
However, virtual meetings reduce viewing to computer displays
and listening to unidirectional speakers, disrupting the real-life
social cues needed for activities such as informally polling participants
“on-the-fly”. This paper proposes an intelligent virtual meeting
app called Show of Hands, which leverages nonverbal communication cues to spontaneously poll virtual participants. The app
recognizes virtual participants’ intuitive real-time hand gestures
to express intended polling selections, and then displays in real
time both highly visible video filters that overlay participants’ camera
views and a visual chart of the aggregated polling counts. Our work
benefits virtual participants by enabling them to seamlessly conduct spontaneous
polls that gauge opinions or check attendees’ knowledge without
prior preparation, to express poll responses using familiar physical
gestures, and to stay better informed of poll results through both a
distributed view of video filters and a focused chart visualization.
CCS CONCEPTS
· Human-centered computing → Interactive systems and
tools; User studies; Mixed / augmented reality; Gestural input.
KEYWORDS
hand gestures, virtual meetings, impromptu polling, educational
interfaces, gesture elicitation
ACM Reference Format:
Jung In Koh, Samantha Ray, Josh Cherian, Paul Taele, and Tracy Hammond.
2023. An Intelligent Virtual Meeting App for Seamlessly Polling Virtual
Participants “On-the-Fly” with Nonverbal Communication Cues. In 28th
International Conference on Intelligent User Interfaces (IUI ’23), March 27–31,
2023, Sydney, Australia. ACM, New York, NY, USA, 4 pages. https://doi.org/
10.1145/3581754.3584141
Permission to make digital or hard copies of part or all of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for third-party components of this work must be honored.
For all other uses, contact the owner/author(s).
IUI ’23, March 27–31, 2023, Sydney, Australia
© 2023 Copyright held by the owner/author(s).
ACM ISBN 979-8-4007-0107-8/23/03.
https://doi.org/10.1145/3581754.3584141
1 INTRODUCTION
The recent demand for and interest in synchronous online classes
and alternative professional meeting opportunities have driven the
growing reliability and ubiquity of video conferencing technology.
As a result, people can now attend meetings remotely and
virtually while still performing many of the same tasks they would in a
traditional in-person meeting. One such task is impromptu polling,
an interaction in which a host gauges the opinions of attendees
by collecting and interpreting their prompted responses; this leaves
hosts better informed of the virtual meeting participants’ state
of mind and makes attendees feel more engaged and empowered
because their feedback is heard.
However, conducting impromptu polling in virtual meetings
poses challenges not present in traditional in-person
interactions. In co-located meeting environments, hosts can
immediately gauge the collective opinions of their peers through
visual cues (e.g., raised hands, hand gestures, facial expressions)
and audio cues (e.g., verbal responses, group volume levels);
without co-located presence, such visual and audio cues are
more difficult to gauge through video
conferencing technology [1, 9]. Current online polling websites
such as Poll Everywhere [2] and Kahoot! [6] provide solutions that
allow hosts to conduct structured polling in a more seamless and
entertaining way, respectively, but such solutions are constrained
to polls that are prepared in advance of the meetings (e.g., online
quizzes, formal surveys). Furthermore, developers of popular video
conferencing software such as Zoom [10], Google Meet [3], and
Microsoft Teams [8] have been motivated by the need to bridge
the gap between traditional in-person meetings and virtual remote
meetings, introducing engaging visual communication markers
(e.g., text messaging emoji, video filters) to supplement users’ live-
streamed body cues. Some of these tools have begun developing
hand gesture recognition for common gestures such as raising a
hand and thumbs up. However, such approaches still require that
hosts manually view these digital visual cues. The cognitive load
required for comprehending this feedback increases significantly as
the size of the meeting audience increases. The challenge of this
situation grows when considering that A) attendees may choose
to turn off their microphones and/or cameras at any point during
the meeting, and B) all attendees may not be visible on the host’s
screen at once, being spread across pages of small video windows.