A Comparison of Voice Controlled and Mouse Controlled Web Browsing

Kevin Christian * kevin@cs.umd.edu
Bill Kules *+ wmk@cs.umd.edu
Ben Shneiderman *^ ben@cs.umd.edu
Adel Youssef * adel@cs.umd.edu

* Department of Computer Science
University of Maryland at College Park
College Park, MD 20742

^ Human-Computer Interaction Laboratory
Institute for Advanced Computer Studies
Institute for Systems Research
University of Maryland at College Park
College Park, MD 20742

+ Takoma Software, Inc.
7006 Poplar Avenue
Takoma Park, MD 20912

ABSTRACT
Voice-controlled web browsers allow users to navigate by speaking the text of a link or an associated number instead of clicking with a mouse. One such browser is Conversa, by Conversational Computing. This within-subjects study with 18 subjects compared voice browsing with traditional mouse-based browsing. It attempted to identify which of three common hypertext forms (linear slide show, grid/tiled map, and hierarchical menu) are well suited to voice navigation, and whether voice navigation is helped by numbering links. The study shows that voice control adds approximately 50% to the performance time for certain types of tasks. Subjective satisfaction measures indicate that for voice browsing, textual links are preferable to numbered links.

Keywords
Human-computer interaction, user interfaces, voice browsers, voice recognition, web browsing

INTRODUCTION
Information contained on the World Wide Web is inaccessible to many people. The web is primarily a visual medium that requires a keyboard and mouse to navigate, and this disenfranchises several types of users. People who lack the motor skills to use a keyboard and mouse find navigation troublesome. Visually impaired users cannot read the display.
People who do not have access to an Internet-capable computer have difficulty even accessing the World Wide Web, and those who temporarily cannot use a traditional web browser (for example, because their eyes or hands are occupied or because they are not near their computer) are at a minimum inconvenienced.

Speech recognition and generation technologies offer a potential solution to these problems by augmenting the capabilities of a web browser. A voice browser is a web browser with at least one of the following capabilities: it can render web pages in an audio format (speech generation), or it can interpret spoken input for navigation (speech recognition).

A number of voice browsers are on the market, and more are under development. Conversational Computing's Conversa is a web browser that accepts speech input but renders the pages in the traditional visual manner [18]. The Home Page Reader, from IBM, renders web pages in audio format but accepts commands only via the keyboard's number pad [20]. PipeBeach is a system that affords both audio rendering of web pages and speech input. LIASON, from Siemens, Inc., is a system designed for use while driving an automobile [25]. Systems specifically designed to accommodate telephone-based browsing include Lucent's PhoneBrowser, Siemens' DICE, and 1-800-Hypertext [1,4,25]. Other systems are application-specific. VADAR, from BBN, allows users to track shipments over the World Wide Web, while Talk'n'Travel, also from BBN, is an interface for commercial-travel websites that allows users to access flight and train schedules [20]. The GALAXY project at MIT is a system that will access the web to find information in response to a user's queries [21].

Users with temporary or permanent motor impairments stand to gain much from such products. A web browser that can render web pages in audio format will be of obvious use to the blind, and navigating by voice obviates the need for keyboard and mouse navigation.
Additionally, people whose eyes and hands are otherwise engaged may still be able to conveniently access the web. For example, someone will be able to get directions via the web while driving their car.

Voice browsers open up new possibilities for bringing the content of the web to a larger segment of the population. A voice browser potentially makes the telephone capable of Internet access. Since the number of households with telephones is far greater than the number of households with Internet-capable computers, it stands to reason that the