Proceedings of the International Multiconference on Computer Science and Information Technology, pp. 497–502
ISBN 978-83-60810-22-4, ISSN 1896-7094

Abstract: This paper presents a new methodology for conducting audio-visual correlation analysis using a gaze tracking system. The interaction between two perceptual modalities, seeing and hearing, and their mutual reinforcement in a complex relationship have been the subject of many research studies. An earlier stage of the experiments carried out at the Multimedia Systems Department (MSD) showed that there is a relationship between the perception of video presented on the screen and the accompanying audio signals, both stereo and spatial. These results were based on subjective tests. Applying the gaze tracking system to the subjective domain may be a step towards objectivization of the results obtained during such tests.

In the paper, a short review of audio-visual correlation examination methods is presented first. Then, a gaze tracking system engineered at the MSD is briefly presented, together with its calibration. The assumptions of the preliminary experiments are outlined, and the realization of some preliminary tests is described. Conclusions concerning the proposed objective methodology of audio-visual correlation analysis are also included.

I. INTRODUCTION

This paper presents an innovative method of analyzing correlations between stereo sound and video. Many scientists have researched audio-visual correlation for years. Research carried out at the Multimedia Systems Department (MSD) during the last decade confirmed that visual objects can influence the subjective localization of sound sources [1-3][5-7][10-11][15-18].

The innovative method described in the paper employs the gaze tracking system developed at the MSD.
Gaze tracking is an objective technique that may yield credible results of audio-visual correlation analysis in a non-invasive way. Experiments with the gaze tracking technique consist in determining the part of the screen at which the user is looking and comparing it with the content of the video image. It is worth mentioning that gaze tracking systems are often used for such tasks as checking user attention [4]. That is why it seems valuable to employ such a system in the domain of subjective testing, where the reliability of the so-called experts is of great importance. This paper describes preliminary tests carried out with the gaze tracking system applied to audio-visual correlation. The calibration process of the gaze tracking system is also briefly described, and conclusions are provided.

Research funded within the project No. POIG.01.03.01-22-017/08, entitled "Elaboration of a series of multimodal interfaces and their implementation to educational, medical, security and industrial applications". The project is subsidized by the European Regional Development Fund and by the Polish State budget.

II. REVIEW OF AUDIO-VISUAL CORRELATION METHODS

A. First experiments

Experiments concerning audio-visual perception have been carried out since the 19th century. The first results, published by Stratton, showed that the ability to localize sound sources depended on visual cues. The participants in these tests wore glasses that flipped the image vertically and were then asked to localize sound sources [18].

B. Later research

Research conducted by Witkin et al. in 1952 tested the influence of seeing an announcer's face on the localization of his or her voice [24]. The tests showed that people perceived the voice as coming from the center when the announcer's face was visible. Under the same conditions, the subjects reported that the voice came from the side when their eyes were closed.
The research by Witkin proved that the so-called 'image proximity effect' exists. This effect consists in perceiving the sound source as shifted towards the displayed image, relative to the place where it is perceived when the image is not displayed [20].

Other researchers conducted experiments that confirmed the observations made by Stratton. Thomas [22] showed that visual cues do not need to be directly related to the sound: in his experiments he used lamplight and the sound of a bell. The development of television created a demand for new research studies associated with the localization of stereo sound under the influence of the television screen.

A New Method of Audio-Visual Correlation Analysis
Bartosz Kunka, Gdansk University of Technology, Multimedia Systems Department, ul. Narutowicza 11/12, 80-233 Gdansk, Poland, email: kuneck@sound.eti.pg.gda.pl
Bozena Kostek, Gdansk University of Technology, Multimedia Systems Department, ul. Narutowicza 11/12, 80-233 Gdansk, Poland, email: bozenka@sound.eti.pg.gda.pl

C. Research dedicated to television

Komiyama and Nakabayashi researched the influence of a large-format television image on the perceived direction of sound in the vertical plane [2]. The image of