IEICE TRANS. INF. & SYST., VOL.E88–D, NO.3 MARCH 2005

LETTER  Special Section on Corpus-Based Speech Technologies

CIAIR In-Car Speech Corpus —— Influence of Driving Status ——

Nobuo KAWAGUCHI†a), Shigeki MATSUBARA†, Kazuya TAKEDA†, and Fumitada ITAKURA††, Members

SUMMARY  CIAIR, Nagoya University, has been compiling an in-car speech database since 1999. This paper describes the basic information contained in this database and an analysis, based on the database, of the effects of driving status. We have developed a system called the Data Collection Vehicle (DCV), which supports synchronous recording of multi-channel audio data from 12 microphones that can be placed throughout the vehicle, multi-channel video recording from three cameras, and the collection of vehicle-related data. In the compilation process, each subject had conversations with three types of dialog system: a human, a "Wizard of Oz" system, and a spoken dialog system. Vehicle information such as speed, engine RPM, accelerator/brake-pedal pressure, and steering-wheel motion was also recorded. In this paper, we report on the effect that driving status has on phenomena specific to spoken language.

key words: speech corpus, in-car speech, ITS

1. Introduction

The Center for Integrated Acoustic Information Research (CIAIR) has been compiling a database of in-car speech and dialog since 1999. This has been done with the goals of achieving robust speech recognition in actual usage environments and improving the level of spoken dialog [1]–[7]. At CIAIR, we have also constructed a specialized speech database recording vehicle (Fig. 1), and have recorded multi-modal information including speech and video, as well as information regarding vehicle operation and position, using more than 800 subjects. (Details on the recording methods and equipment have been given elsewhere [5].)
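The DCV's central design point is the time-aligned recording of three modalities: 12 microphone channels, 3 camera streams, and driving-status signals. A minimal sketch of one such synchronized frame is given below; the channel counts and signal types (speed, engine RPM, pedal pressure, steering) come from the paper, but every class and field name is an illustrative assumption, not the actual CIAIR data format.

```python
from dataclasses import dataclass, field
from typing import List

# Channel counts as reported for the DCV; all names below are
# hypothetical, for illustration only.
N_MICS = 12
N_CAMERAS = 3

@dataclass
class VehicleState:
    """Driving-status signals recorded alongside the audio and video."""
    speed_kmh: float
    engine_rpm: float
    accel_pedal: float   # normalized accelerator-pedal pressure, 0.0-1.0
    brake_pedal: float   # normalized brake-pedal pressure, 0.0-1.0
    steering_deg: float  # steering-wheel angle in degrees

@dataclass
class SyncFrame:
    """One time-aligned frame across all recorded modalities."""
    timestamp_ms: int
    audio: List[float] = field(default_factory=lambda: [0.0] * N_MICS)
    video_frame_ids: List[int] = field(default_factory=lambda: [0] * N_CAMERAS)
    vehicle: VehicleState = field(
        default_factory=lambda: VehicleState(0.0, 0.0, 0.0, 0.0, 0.0))

# A single empty frame at the start of a session.
frame = SyncFrame(timestamp_ms=0)
assert len(frame.audio) == N_MICS
assert len(frame.video_frame_ids) == N_CAMERAS
```

The single shared timestamp per frame is what makes cross-modal analysis (e.g., relating an utterance to the concurrent pedal or steering activity) straightforward.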
In this paper, we report on this in-car speech database and recording vehicle, and show how the data can be used to analyze the effects of driving status on spoken-language phenomena that reflect the mental focus of drivers. The unique characteristic of this speech database is that it was compiled while the subjects were actually driving the vehicle, so the dialog was recorded under different conditions than in the case of normal spoken dialog. In recordings starting from 2000, we recorded dialog using a system based on the Wizard of Oz method [6] and a spoken dialog system, as well as a human operator who played the part of a mechanical system, with each of these in-car information systems acting as the second party in the dialog.

Manuscript received July 1, 2004.
Manuscript revised September 10, 2004.
†The authors are with Nagoya University, Nagoya-shi, 464–8601 Japan.
††The author is with Meijyo University, Nagoya-shi, 468–8502 Japan.
a) E-mail: kawaguti@nagoya-u.jp
DOI: 10.1093/ietisy/e88-d.3.578

2. Construction of the In-Car Speech Database

The goal of this recording was to gather data while the subject actually drove the vehicle in a real driving environment. Table 1 is an outline of the sessions recorded for each subject. In 1999, about 11 minutes of spoken dialog with a human operator (HUM) was recorded for each subject. (These dialogs have been analyzed elsewhere [6].) From 2000 onwards, we introduced spoken dialog with a Wizard of Oz system (WOZ) and a spoken dialog system (SYS) [6] to achieve more realistic recordings. We made five-minute

Fig. 1  CIAIR data collection vehicle.
Table 1  Collected speech data.

Copyright © 2005 The Institute of Electronics, Information and Communication Engineers
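The session layout described in Sect. 2 amounts to tagging each recording with a subject, a year, and the type of dialog partner (HUM, WOZ, or SYS). A minimal sketch of such per-session metadata follows; the three partner types and the 1999/2000 timeline come from the paper, while the class and field names are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum

# The three dialog-partner types used in the corpus; names below
# (Session, duration_min, etc.) are hypothetical, for illustration.
class DialogPartner(Enum):
    HUM = "human operator"        # recorded from 1999
    WOZ = "Wizard of Oz system"   # introduced from 2000
    SYS = "spoken dialog system"  # introduced from 2000

@dataclass
class Session:
    subject_id: int
    year: int
    partner: DialogPartner
    duration_min: float

# A 1999-style session: about 11 minutes of dialog with a human operator.
s = Session(subject_id=1, year=1999, partner=DialogPartner.HUM, duration_min=11.0)
assert s.partner is DialogPartner.HUM
```

Indexing sessions this way makes it easy to contrast, say, spoken-language phenomena in human-human dialog against those in dialog with the automated systems.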