ACEEE Int. J. on Signal & Image Processing, Vol. 02, No. 01, Jan 2011
© 2011 ACEEE
DOI: 01.IJSIP.02.01.231
Motion Vector Recovery for Real-time H.264
Video Streams
Kavish Seth¹, Tummala Rajesh¹, V. Kamakoti², and S. Srinivasan¹
¹ Dept. of Electrical Engg., Indian Institute of Technology Madras, Chennai, India
Email: {kavishseth, trajeshreddy}@gmail.com, srini@ee.iitm.ac.in
² Dept. of Computer Sc. and Engg., Indian Institute of Technology Madras, Chennai, India
Email: veezhi@gmail.com
Abstract— Among the various network protocols that can be
used to stream video data, RTP over UDP is the best suited
to real-time streaming of H.264 video. Video transmitted
over a communication channel is highly prone to errors, and
the problem becomes critical when UDP is used, since UDP
does not retransmit lost packets. In such cases real-time
error concealment becomes important. Motion vector
recovery is a subclass of error concealment that conceals
errors at the decoder side. Lagrange interpolation is a
popular technique for motion vector recovery and the
fastest of those reported. This paper proposes a new system
architecture that enables both RTP-over-UDP real-time video
streaming and Lagrange-interpolation-based real-time motion
vector recovery in H.264 coded video streams. FFmpeg, a
completely open-source codec library with H.264 support, is
chosen to implement the proposed system. The implementation
was tested on several standard benchmark video sequences,
and the quality of the recovered video was measured at the
decoder side using several quality metrics. Experimental
results show that real-time motion vector recovery does not
introduce any noticeable degradation or latency during
display of the recovered video.
Index Terms—Digital Video, Motion Vector, Error
Concealment, H.264, UDP, RTP
I. INTRODUCTION
Video streaming is a very common mode of video
communication today. Its low cost, convenience and
worldwide reach have made it a hugely popular mode of
transmission. The video can be either a pre-recorded
sequence or a live stream. Captured video is raw and takes
up a lot of storage space, so video compression is widely
employed in video communication systems to meet channel
bandwidth requirements.
H.264 is currently one of the latest and most popular
video coding standards [1]. Compared to previous coding
standards, it delivers higher video quality for a given
compression ratio, and a better compression ratio for the
same video quality. Because of this, variations of H.264
are used in many applications including HD-DVD, Blu-ray,
iPod video, HDTV broadcasts, and most recently
streaming media.
For streaming media, the compressed video is sent in the
form of packets, which are highly prone to transmission
errors: a packet may be damaged or may not be received at
all. In block-based coding schemes such as H.264, such
errors are likely to damage a Group of Blocks (GOB) in the
decoded frame. Errors also propagate because of the high
correlation between neighboring frames, degrading the
quality of successive frames. The Real-time Transport
Protocol (RTP) over the User Datagram Protocol (UDP) is the
recommended and most commonly used mechanism for streaming
the H.264 media format [2].
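As an illustration of the RTP-over-UDP transport named above, the following is a minimal sketch of wrapping one H.264 NAL unit in an RTP packet in single NAL unit mode, where the NAL unit follows the 12-byte fixed RTP header unchanged. It is not the paper's FFmpeg module: the function names, the dynamic payload type 96, and the absence of CSRC entries and header extensions are assumptions made for the example.

```python
import struct

RTP_VERSION = 2
RTP_HEADER_LEN = 12  # fixed header, assuming no CSRC list and no extension


def pack_rtp_single_nal(nal_unit, seq, timestamp, ssrc,
                        payload_type=96, marker=False):
    """Prepend a fixed RTP header to one NAL unit (hypothetical helper)."""
    if not nal_unit:
        raise ValueError("empty NAL unit")
    nal_type = nal_unit[0] & 0x1F  # low 5 bits of the NAL header byte
    if not 1 <= nal_type <= 23:
        raise ValueError("not a single NAL unit type")
    byte0 = RTP_VERSION << 6  # padding=0, extension=0, CSRC count=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + nal_unit


def unpack_rtp_single_nal(packet):
    """Split a packet built by pack_rtp_single_nal into header fields and NAL unit."""
    if len(packet) <= RTP_HEADER_LEN:
        raise ValueError("packet too short")
    byte0, byte1, seq, timestamp, ssrc = struct.unpack(
        "!BBHII", packet[:RTP_HEADER_LEN])
    if byte0 >> 6 != RTP_VERSION:
        raise ValueError("unsupported RTP version")
    if byte0 & 0x3F:
        raise ValueError("padding, extension or CSRC entries not handled")
    return {
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "nal_unit": packet[RTP_HEADER_LEN:],
    }
```

A sender would pass each resulting packet to `sendto` on a `socket.SOCK_DGRAM` socket. Because UDP gives no delivery guarantee, a receiver can use gaps in the 16-bit sequence number to detect lost packets before concealment is applied.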
Various approaches have been used to achieve error
resilience in order to deal with the above problem. A good
overview of such methods is given in [3], [4]. One way to
overcome this problem is to implement Error Concealment
(EC) at the decoder side. Motion Vector Recovery (MVR) is
one form of EC; it uses mathematical techniques to recover
the erroneous motion fields. Among the various MVR
techniques reported in the literature, Lagrange
Interpolation (LAGI) is the fastest and a popular one, and
it produces high-quality recovered video [5].
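To make the idea concrete, the following is a minimal sketch of Lagrange interpolation applied to a lost motion vector. It assumes the motion vectors of correctly received neighbouring blocks are available at known integer positions along one axis, and that each component is interpolated independently; the choice of neighbours, the function names, and the data layout are assumptions for illustration and are not taken from [5] or from the implementation described in this paper.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate at x the Lagrange polynomial through the points (xs[i], ys[i])."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    if len(set(xs)) != len(xs):
        raise ValueError("sample positions must be distinct")
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = float(yi)
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)  # i-th basis polynomial factor
        total += term
    return total


def recover_motion_vector(neighbours, lost_pos):
    """Estimate the (mvx, mvy) of a lost block (hypothetical helper).

    neighbours: list of (position, (mvx, mvy)) for received blocks.
    lost_pos:   position of the block whose motion vector was lost.
    """
    positions = [pos for pos, _ in neighbours]
    mvx = lagrange_interpolate(positions, [mv[0] for _, mv in neighbours], lost_pos)
    mvy = lagrange_interpolate(positions, [mv[1] for _, mv in neighbours], lost_pos)
    return mvx, mvy
```

For example, with neighbours at positions 0, 1, 3 and 4 and the lost block at position 2, `recover_motion_vector` fits a cubic through the four known vectors and evaluates it at 2. The cost is a small fixed number of multiplications per lost vector, which is consistent with the technique being suitable for real-time use.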
FFmpeg is a comprehensive multimedia encoding and
decoding library that supports numerous audio, video,
and container formats [6]. This paper proposes a new
system architecture that enables real-time video
streaming as well as real-time MVR. The proposed
architecture is implemented in both the FFmpeg coder and
decoder. An RTP packet encapsulation/decapsulation
module is added to the FFmpeg codec; it packs/unpacks
Network Abstraction Layer (NAL) units [2], [7] into
single NAL unit type RTP packets. A UDP socket program is
used at both the coder and decoder sides to stream the RTP
packets over UDP. A LAGI-based MVR technique is
implemented at the decoder side. The proposed system
implementation was tested on several benchmark video
sequences. The quality of the received video can be
measured using quality metrics such as Peak Signal to
Noise Ratio (PSNR) and the Video Quality Evaluation Metric
(VQM) [8]. The experimental section presents a brief
analysis of which quality measurement parameter is best
suited to streaming video analysis. Experimental results
show that the proposed implementation does not introduce
any latency or degradation in the quality of the recovered video