CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE
Concurrency Computat.: Pract. Exper. (2011)
Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/cpe.1911
Optimizing H.264/AVC interprediction on a GPU-based framework
Rafael Rodríguez-Sánchez¹,*,†, José Luis Martínez², Gerardo Fernández-Escribano¹, José L. Sánchez¹, José M. Claver³ and Pedro Diaz¹
¹ Instituto de Investigación en Informática de Albacete, Universidad de Castilla-La Mancha, Avenida de España s/n, 02071, Albacete, Spain
² Architecture and Technology of Computing Systems Group, Complutense University, Madrid, Spain
³ Departamento de Informática, Universidad de Valencia, Avenida de Vicente Andrés Estellés, s/n, 46100 Burjassot, Valencia, Spain
SUMMARY
H.264/MPEG-4 Part 10 is the latest standard for video compression and promises a significant advance
in terms of quality and distortion compared with the commercial standards currently most in use, such as
MPEG-2 or MPEG-4. To achieve this better performance, H.264 adopts a large number of new or improved
compression techniques compared with previous standards, albeit at the expense of higher computational
complexity. In addition, in recent years new hardware accelerators have emerged, such as graphics
processing units (GPUs), which provide a new opportunity to reduce complexity for a large variety of algorithms.
However, current GPUs suffer from higher power consumption requirements because of their design. Up to
now, GPU-based software developers have not taken this into account. In this paper, we present a detailed
procedure to implement H.264 motion estimation on a GPU, with the aim of reducing execution time and, as
a consequence, energy consumption. The results show a negligible drop in rate-distortion performance, with a time
reduction of over 91.5% on average and a reduction in energy consumption by a factor of 11.78 compared
with the reference implementation. Copyright © 2011 John Wiley & Sons, Ltd.
Received 15 February 2011; Revised 3 October 2011; Accepted 31 October 2011
KEY WORDS: heterogeneous computing; H.264/AVC; motion estimation
1. INTRODUCTION
H.264/AVC is the most recent predictive video compression standard, and it outperforms previously
existing video codecs [1]. The H.264/AVC standard builds on those previous coding standards
to achieve a compression gain of about 50%, largely at the cost of increased encoder complexity [1]. These
compression gains are mainly related to variable block-size motion compensation, improved entropy
coding, multiple reference frames, and a smaller block transform, among others.
In addition, in the past few years new heterogeneous architectures have been introduced
in high-performance computing [2]. Graphics processing units (GPUs) are one example of such
architectures. GPUs are small accelerator devices with hundreds of similar processing cores that are
designed and organized with the goal of achieving higher performance. Although GPUs can be used
for general-purpose computing, they originated primarily in multimedia and computer or console gaming.
Furthermore, the most important GPU vendors, NVIDIA and AMD/ATI, have provided several
tools to facilitate the programming of these devices. CUDA (Compute Unified Device
Architecture) [3], introduced by NVIDIA in 2007, is designed to support joint CPU/GPU
*Correspondence to: Rafael Rodríguez-Sánchez, Instituto de Investigación en Informática de Albacete, Universidad de
Castilla-La Mancha, Avenida de España s/n, 02071, Albacete, Spain.
†E-mail: rrsanchez@dsi.uclm.es