ACEEE Int. J. on Signal & Image Processing, Vol. 02, No. 01, Jan 2011
© 2011 ACEEE, DOI: 01.IJSIP.02.01.231

Motion Vector Recovery for Real-time H.264 Video Streams

Kavish Seth¹, Tummala Rajesh¹, V. Kamakoti², and S. Srinivasan¹
¹ Dept. of Electrical Engg., Indian Institute of Technology Madras, Chennai, India
Email: {kavishseth, trajeshreddy}@gmail.com, srini@ee.iitm.ac.in
² Dept. of Computer Sc. and Engg., Indian Institute of Technology Madras, Chennai, India
Email: veezhi@gmail.com

Abstract—Among the network protocols that can be used to stream video data, RTP over UDP is the best suited to real-time streaming of H.264 video. Video transmitted over a communication channel is highly prone to errors, and the problem can become critical when UDP is used. In such cases real-time error concealment becomes important. Motion vector recovery is a subclass of error concealment that is used to conceal errors at the decoder side. Lagrange interpolation is the fastest and a popular technique for motion vector recovery. This paper proposes a new system architecture that enables RTP-UDP based real-time video streaming as well as Lagrange interpolation based real-time motion vector recovery in H.264 coded video streams. FFmpeg, a completely open source codec library that supports H.264, is chosen to implement the proposed system. The implementation was tested against different standard benchmark video sequences, and the quality of the recovered videos was measured at the decoder side using various quality measurement metrics. Experimental results show that the real-time motion vector recovery does not introduce any noticeable difference or latency during display of the recovered video.

Index Terms—Digital Video, Motion Vector, Error Concealment, H.264, UDP, RTP

I. INTRODUCTION

Streaming of videos is a very common mode of video communication today.
Its low cost, convenience and worldwide reach have made it a hugely popular mode of transmission. The video can be either a pre-recorded sequence or a live stream. Captured video is raw and takes up a lot of storage space, so video compression technologies are widely employed in video communication systems in order to meet channel bandwidth requirements. H.264 is currently one of the latest and most popular video coding standards [1]. Compared to previous coding standards, it delivers higher video quality for a given compression ratio and a better compression ratio for the same video quality. Because of this, variations of H.264 are used in many applications including HD-DVD, Blu-ray, iPod video, HDTV broadcasts, and most recently streaming media.

For streaming media, the compressed video is sent in the form of packets. These packets are highly prone to transmission errors: a packet may be damaged or may not be received at all. In block-based coding schemes such as H.264, such errors are likely to damage a Group of Blocks (GOB) of data in the decoded frames. Errors also propagate because of the high correlation between neighboring frames, degrading the quality of successive frames. The Real-time Transport Protocol (RTP) over User Datagram Protocol (UDP) is the recommended and commonly used mechanism for streaming the H.264 media format [2].

Various approaches have been used to achieve error resilience in order to deal with the above problem. A good overview of such methods is given in [3], [4]. One way to overcome the problem is to implement Error Concealment (EC) at the decoder side. Motion Vector Recovery (MVR) is one form of EC, which uses mathematical techniques to recover the erroneous motion fields.
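
To make the idea of recovering a motion field concrete, the following is a minimal sketch of Lagrange interpolation, the MVR technique named in the abstract, applied to one lost motion vector. It is an illustration under stated assumptions and not the implementation described in this paper: the function names, the use of four neighboring blocks, and their positions relative to the lost block are all hypothetical choices made for the example.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate at x the Lagrange polynomial passing through (xs[i], ys[i])."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Build the i-th Lagrange basis polynomial, scaled by the sample yi.
        term = float(yi)
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def recover_motion_vector(positions, neighbour_mvs, lost_position):
    """Estimate the (mvx, mvy) of a lost block from its neighbours.

    positions     -- hypothetical 1-D block coordinates of the intact neighbours
    neighbour_mvs -- their motion vectors as (mvx, mvy) pairs
    lost_position -- coordinate of the lost block on the same axis
    Each component is interpolated independently.
    """
    mvx = lagrange_interpolate(positions, [mv[0] for mv in neighbour_mvs], lost_position)
    mvy = lagrange_interpolate(positions, [mv[1] for mv in neighbour_mvs], lost_position)
    return mvx, mvy


# Assumed layout: two intact blocks above (-2, -1) and two below (1, 2)
# the lost block, which sits at position 0.
positions = [-2, -1, 1, 2]
neighbour_mvs = [(-3, 4), (-1, 1), (3, 1), (5, 4)]
recovered = recover_motion_vector(positions, neighbour_mvs, 0)
```

In this example the horizontal components lie on a straight line and the vertical components on a parabola, so the interpolated vector is close to (1, 0) up to floating-point rounding. A decoder would typically round the result to its motion vector precision before use.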
Among the various MVR techniques reported in the literature, Lagrange Interpolation (LAGI) is the fastest and a popular technique, and it produces recovered video of high quality [5].

FFmpeg is a comprehensive multimedia encoding and decoding library that supports numerous audio, video, and container formats [6]. This paper proposes a new system architecture that enables real-time video streaming as well as real-time MVR. The proposed architecture is implemented in both the FFmpeg coder and decoder. An RTP packet encapsulation/decapsulation module is added to the FFmpeg codec; it packs the Network Abstraction Layer (NAL) units [2], [7] into single NAL unit type RTP packets and unpacks them again. A UDP socket program is used at both the coder and decoder sides to stream the RTP packets over UDP. A LAGI based MVR technique is implemented at the decoder side.

The proposed system implementation was tested against different benchmark video sequences. The quality of the received videos can be measured using various quality measurement tools such as Peak Signal to Noise Ratio (PSNR) and the Video Quality Evaluation Metric (VQM) [8]. The experimental section presents a brief analysis of which quality measurement parameter is best suited for streaming video analysis. Experimental results show that the proposed implementation does not introduce any latency or degradation in the quality of the recovered video
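
The RTP encapsulation and decapsulation step described above can be sketched as follows. This is a minimal illustration and not the module added to FFmpeg in this work: it assumes the 12-byte fixed RTP header with no CSRC list or header extension, a dynamic payload type of 96, and hypothetical function names. In single NAL unit mode the RTP payload is simply the NAL unit itself, including its one-byte NAL header.

```python
import struct

# Fixed RTP header: flags byte, marker/payload-type byte,
# 16-bit sequence number, 32-bit timestamp, 32-bit SSRC (network byte order).
RTP_HEADER = struct.Struct("!BBHII")
RTP_VERSION = 2


def pack_single_nal(nal_unit, seq, timestamp, ssrc, payload_type=96, marker=False):
    """Wrap one NAL unit in an RTP packet (single NAL unit mode).

    payload_type=96 is an assumed dynamic payload type for this sketch.
    """
    if not nal_unit:
        raise ValueError("empty NAL unit")
    nal_type = nal_unit[0] & 0x1F  # low 5 bits of the NAL header byte
    if not 1 <= nal_type <= 23:
        raise ValueError("NAL type is outside the single NAL unit range 1..23")
    byte0 = RTP_VERSION << 6  # version 2, no padding, no extension, CSRC count 0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = RTP_HEADER.pack(byte0, byte1, seq & 0xFFFF,
                             timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + bytes(nal_unit)


def unpack_single_nal(packet):
    """Split an RTP packet produced by pack_single_nal into its fields."""
    if len(packet) <= RTP_HEADER.size:
        raise ValueError("packet too short to carry a NAL unit")
    byte0, byte1, seq, timestamp, ssrc = RTP_HEADER.unpack_from(packet)
    if byte0 >> 6 != RTP_VERSION:
        raise ValueError("unsupported RTP version")
    if byte0 & 0x3F:
        # Padding, extension and CSRC entries are not handled in this sketch.
        raise ValueError("padding, extension or CSRC list not supported")
    return {
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "nal_unit": packet[RTP_HEADER.size:],
    }


# Example: an illustrative NAL unit whose header byte 0x65 marks an IDR slice (type 5).
nal = bytes([0x65, 0x01, 0x02, 0x03])
packet = pack_single_nal(nal, seq=7, timestamp=90000, ssrc=0x1234, marker=True)
fields = unpack_single_nal(packet)
```

A sender would pass each resulting packet to a UDP socket, and the receiver can use gaps in the sequence number to detect lost packets before invoking error concealment.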