An Information Acquiring Channel: Lip Movement

Xiaopeng Hong, Hongxun Yao, Qinghui Liu, Rong Chen
Department of Computer Science and Engineering, Harbin Institute of Technology
Harbin, 150001, China
{xphong, yhx, qhliu, rchen}@vilab.hit.edu.cn

Abstract. This paper aims to show that lip movement is a viable channel for acquiring information. The argument rests on two working applications built on lip movement information alone: lip-reading and lip-movement utterance recognition. In our experiments to date, the speaker-dependent lip-reading system reaches an accuracy of 68%, and the utterance recognition system achieves over 99.5% in the test-independent (TI) case and nearly 100% in the test-dependent (TD) case. These results indicate that the lip-reading channel is effective and can be applied independently.

1. Introduction

The labial channel is gradually gaining a role in the computer vision field. Speech recognition using both audio and video information has been demonstrated to perform better than recognition using the audio channel alone [1-3, 7-9]. Several well-known automatic speech recognition systems exist, such as the audio-visual speech recognition (AVSR) system of Intel Co. [7] and the system introduced by Neti [8]. Acoustic and labial speaker recognition and identification have also been shown to work [5, 6].

Although most existing systems that use the labial channel combine it with acoustic information, applications that use lip movement information alone are still desirable. At least two such applications are possible: lip-reading and utterance recognition (Fig. 1). Lip-reading recognizes what a person says, while utterance recognition determines who makes the lip movement (we call this person the lip actor in this paper).

Take utterance recognition as an example. Face recognition research has shown some promising results [9].
The mouth or lip area occupies a large part of the human face, so most of the facial information can be obtained from this area. In addition, lip movement offers dynamic information that can make the recognition result more stable and robust. For these reasons, an utterance recognition system that uses lip movement information alone is feasible.

This paper describes two systems that we developed, a lip-reading system and a lip movement utterance recognition system, to show the validity of the labial channel on its own, as the experimental results demonstrate. The rest of this paper describes the implementation of the two systems, and models the lip movement