EYE GAZE AND SPEECH FOR DATA ENTRY: A COMPARISON OF DIFFERENT DATA
ENTRY METHODS
Yeow Kee Tan, Nasser Sherkat, Tony Allen
Nottingham Trent University, Department of Computing and Mathematics
Burton Street, Nottingham
NG1 4BU, United Kingdom
[yeow.tan, nasser.sherkat, tony.allen]@ntu.ac.uk
ABSTRACT
In this paper we present a multimodal interface that employs
speech recognition and eye gaze tracking technology for use in
data entry tasks. The aim of this work is to compare the
usability of this multimodal system against other data entry
methods (handwriting, mouse and keyboard, and speech only)
when carrying out the data entry task of filling a form.
Discussions regarding the relationships between efficiency,
effectiveness, ergonomic quality, hedonic quality, naturalness,
familiarity and user preference are presented. The
experimental results show that the majority of the users prefer
using the proposed eye and speech system compared to the
other form-filling methods even though such a method is neither
the fastest nor the most accurate.
1. INTRODUCTION
Although research into multimodal interfaces dates back
to 1980, research into interfaces that combine active (e.g.
speech recognition) and passive (e.g. eye gaze) inputs is still in
its infancy and needs to be explored further in order to identify
new combinations that can improve the human-computer
interaction. This type of interface is known as a blended style
interface. Typical examples include IBM’s Manual And
Gaze Input Cascaded pointing system [1], which allows the use of
mouse and eye input to select icons on the desktop, and the
combination of facial and speech recognition [2], which is able to
reduce word error rate by 27% in a noisy environment. These
positive results have led us to research combining eye gaze and
speech for data entry [3].
An experiment has been carried out in order to compare the
usability of the proposed multimodal interface against other
data entry methods. The aspects of efficiency, effectiveness and
user preference have been evaluated. In addition to these
common usability aspects, aspects such as easy to use, fun to
use, familiarity and naturalness have also been observed in
order to achieve a more extensive usability study.
According to Jordan, “Usability as a concept does not seem to
include (positive) feelings such as pride, excitement or
surprise” [4]. These positive emotions, mentioned by Jordan,
can be classified as Hedonic Quality (HQ). HQ refers to quality
attributes with no obvious relation to task/goal-fulfillment, e.g.
“original”, “innovative”, “exciting”, or “exclusive”. These
attributes address the human needs for novelty/change (i.e.
excitement) and social power (i.e. status, pride) [5]. It is to be
noted that fun to use differs from easy to use. Carroll and
Thomas [6] argued that ease of use implies simplicity, which in
turn is partly incompatible with fun. For example, making a
system easy to use may also make it boring.
After years of debate, the Human-Computer Interaction (HCI)
research community is now gradually accepting the concept of
joy and fun as an important factor in usability [7]. Based on the
findings in [8], perceived fun has a stronger effect on user
satisfaction than perceived usefulness. It was also observed that
user satisfaction leads to increased time spent with a
software system. This, in turn, may cause the user to use the
system more frequently, thereby gaining a better understanding
of it (increased familiarity). However, introducing the factor
of fun (HQ) may also decrease the simplicity of a system [5].
Whether HQ should be one of the main factors in
system development is still debated in the HCI
community.
In this paper, naturalness is defined as the regular way in
which one human being would pass information to another
human being (e.g. person A telling his/her name to person B).
Familiarity, on the other hand, is defined as the most common
way a particular task (in this case a data entry task of filling a
form) is carried out. Note that naturalness and familiarity are
different in this case. Naturalness relates more to whether the
user experiences a human-to-human input style whilst
interacting with the system. Familiarity is concerned more with
the methods the user has previously used for carrying out a
particular task.
2. SYSTEM IMPLEMENTATION
The experiment in this paper compares the use of handwriting
(HW), mouse and keyboard (MK), speech only (SO) and eye
and speech (ES) for a specific data entry task. The ES and SO
systems use the same interface layout (figure 1). This allows
users to fill out a form that consists of 8 fields (television
type, title, surname, initials, house number, street name, city
I - 41 0-7803-7965-9/03/$17.00 ©2003 IEEE ICME 2003