The role of facial colour and luminance in visual and audiovisual speech perception

Perception, 2003, volume 32, pages 921 - 936

Maxine V McCotter*, Timothy R Jordan
School of Psychology, University of Nottingham, University Park, Nottingham NG7 2RD, UK; e-mail: maxine@psy.gla.ac.uk
Received 3 December 2001, in revised form 29 January 2003; published online 29 August 2003

Abstract. We conducted four experiments to investigate the role of colour and luminance information in visual and audiovisual speech perception. In experiments 1a (stimuli presented in quiet conditions) and 1b (stimuli presented in auditory noise), face display types comprised naturalistic colour (NC), grey-scale (GS), and luminance inverted (LI) faces. In experiments 2a (quiet) and 2b (noise), face display types comprised NC, colour inverted (CI), LI, and colour and luminance inverted (CLI) faces. Six syllables and twenty-two words were used to produce auditory and visual speech stimuli. Auditory and visual signals were combined to produce congruent and incongruent audiovisual speech stimuli. Experiments 1a and 1b showed that perception of visual speech, and its influence on identifying the auditory components of congruent and incongruent audiovisual speech, was reduced for LI faces relative to NC and GS faces, which produced identical results. Experiments 2a and 2b showed that perception of visual speech, and its influence on perception of incongruent auditory speech, was reduced for LI and CLI faces relative to NC and CI faces (which produced identical patterns of performance). Our findings for NC and CI faces suggest that colour is not critical for perception of visual and audiovisual speech. The effect of luminance inversion on performance accuracy was relatively small (5%), which suggests that the luminance information preserved in LI faces remains important for the processing of visual and audiovisual speech.

DOI:10.1068/p3316

* Correspondence concerning this article should be addressed to Maxine McCotter, Psychology Department, University of Glasgow, Glasgow G12 8QB, Scotland, UK.
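For readers who want a concrete picture of the five display types named in the abstract, the sketch below shows one plausible way to derive them from a single face image. This excerpt does not specify the authors' actual processing pipeline; the choice of the YCbCr colour space (used here because it separates luminance from chroma), the Pillow library, and the function name make_display_types are all assumptions made purely for illustration.

```python
# A minimal sketch of the five face display types, assuming a YCbCr
# separation of luminance (Y) from chromatic channels (Cb, Cr).
# This is NOT the authors' documented method, only an illustration.
from PIL import Image, ImageOps

def make_display_types(path):
    nc = Image.open(path).convert('RGB')      # naturalistic colour (NC)
    gs = nc.convert('L').convert('RGB')       # grey-scale (GS): luminance only

    y, cb, cr = nc.convert('YCbCr').split()
    y_inv = ImageOps.invert(y)                # flip light/dark, keep chroma
    cb_inv = ImageOps.invert(cb)              # flip chroma around its midpoint
    cr_inv = ImageOps.invert(cr)

    li = Image.merge('YCbCr', (y_inv, cb, cr)).convert('RGB')        # LI
    ci = Image.merge('YCbCr', (y, cb_inv, cr_inv)).convert('RGB')    # CI
    cli = Image.merge('YCbCr', (y_inv, cb_inv, cr_inv)).convert('RGB')  # CLI
    return {'NC': nc, 'GS': gs, 'LI': li, 'CI': ci, 'CLI': cli}
```

Under this reading, inverting only Y flips light and dark while leaving hue intact (LI), inverting only Cb and Cr flips hue while preserving the luminance pattern (CI), and inverting all three channels approximates a full photographic negative (CLI).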
1 Introduction
Previous research has shown that the perception of auditory speech can be influenced strongly by the appearance of a talker's face. For example, seeing the face of a talker can improve intelligibility in a noisy environment (Erber 1969; MacLeod and Summerfield 1987; Middleweerd and Plomp 1987; Sumby and Pollack 1954) and help recover a difficult message even when the auditory signal is clear (Reisberg et al 1987). The McGurk effect (McGurk and MacDonald 1976) further demonstrates the influence of visual speech: different auditory and visual inputs combine to form an illusory percept not present in either modality alone. For example, when auditory /ba/ is presented in synchrony with the facial movements for /ga/, observers often report a syllable that is a fusion of both sources of information (eg 'da' or 'tha'). McGurk effects occur even when the auditory signal is clear and unambiguous and even when observers are aware of the dubbing process (Liberman 1982). It is commonly (and logically) assumed that basic visual cues must be encoded for visual speech to be recognised and to affect auditory speech recognition in face-to-face communications (eg Benoit et al 1996; Brooke and Summerfield 1983; Campbell et al 1998; Cohen et al 1996; Marassa and Lansing 1995; Massaro 1998; Massaro and Cohen 1990; Montgomery and Jackson 1983; Summerfield and McGrath 1984; Summerfield et al 1989). Conflicting accounts exist of how auditory and visual signals are transformed and integrated into a linguistic or phonetic code (eg Massaro 1987, 1998; Schwartz et al 1998; Summerfield 1987), but these models do not focus on the precise nature of the visual parameters that specify these inputs. Recent findings suggest that detailed information (eg lip shape, tongue position and shape, and visibility of teeth) is not critical for visual and audiovisual speech recognition (Campbell and