Transform-Domain Intra Prediction for H.264
Chen Chen and Ping-Hao Wu
Graduate Institute of Communications Engineering,
National Taiwan University,
Taipei, Taiwan 10617, R.O.C.
Email: {r92942081, b89901043}@ntu.edu.tw
Homer Chen
Department of Electrical Engineering, GICE, and INM.
National Taiwan University,
Taipei, Taiwan 10617, R.O.C.
Email: homer@cc.ee.ntu.edu.tw
Abstract—H.264/AVC is the newest video coding standard
jointly developed by the ITU-T Video Coding Experts Group
and the ISO/IEC Moving Picture Experts Group. In contrast
to some previous coding standards such as H.263+ and MPEG-4
Part-2, where intra prediction is performed in the transform
domain, the intra prediction of H.264 is completely defined in
the pixel domain. This presents a challenge to multimedia
systems in which transcoding is conducted in the transform
domain for the purpose of computational efficiency. In this
paper, we show how to obtain the transform-domain
predictions for various intra modes of H.264. We begin by
converting the intra prediction from the pixel domain to the
transform domain through matrix manipulation. Then we
show how the operations involved in the matrix manipulation
can be simplified. A computational complexity analysis of each
intra prediction mode of H.264 is provided.
I. INTRODUCTION
In view of the successful adoption of MPEG-2
technology in the DVD and DTV industries and the growing
market potential of the newly developed H.264 standard, we
are motivated to investigate the conversion of existing
MPEG-2 multimedia content to the H.264 format. The
overall architecture of our proposed scheme for MPEG-2 to
H.264 transcoding is described in [6]. In this paper, we
present the details of our transform-domain approach for
H.264 intra prediction.
H.264/AVC [1] is the newest video coding standard
jointly developed by the ITU-T Video Coding Experts
Group and the ISO/IEC Moving Picture Experts Group. It
provides better compression efficiency than previous video
coding standards. Transcoding is one of the key
technologies for enabling interoperability between different
systems and devices [1], [2]. The purpose of transcoding is
to convert a multimedia signal from one format to another.
A format conversion may change the bitrate, frame rate,
coding standard, or other properties of the signal.
In contrast to some previous coding standards such as
H.263+ and MPEG-4 Part-2, where intra prediction is
performed in the transform domain, the intra prediction of
H.264 is completely defined in the pixel domain by
referring to neighboring samples of the previously coded
blocks. Transcoding in the transform (or compressed)
domain has been a desired approach [3], [4] because it
avoids complete decoding and re-encoding, which is
computationally expensive. Therefore, there is a need to
investigate transform-domain intra prediction for H.264.
This paper is organized as follows. In Section II, we
describe the proposed transform-domain intra prediction
algorithm. The complexity of the algorithm is presented in
Section III. Finally, a conclusion is given in Section IV.
II. INTRA PREDICTION
In H.264, there are a total of 9 prediction modes for a
4×4 sub-block and 4 for a 16×16 macroblock. If a sub-block
or a macroblock is to be coded in intra mode, a
prediction block is formed based on the neighboring
samples of previously coded blocks that are to the left
and/or immediately above the block to be coded. The
prediction block is then subtracted from the current block
prior to encoding. In this section, the transform-domain
intra prediction for luma samples is discussed. The
transform-domain intra prediction of chroma samples can be
easily derived in a similar way and is not described here.
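The benefit of predicting in the transform domain can be checked numerically. Because the 4×4 core transform is linear, transforming the residual (current block minus prediction block) gives the same coefficients as subtracting the transformed prediction from the transformed current block. The Python sketch below assumes the forward core transform matrix of H.264 with the scaling and quantization stages omitted. The helper names and the sample values are illustrative and are not taken from this paper.

```python
# Forward 4x4 core transform matrix of H.264 (post-scaling omitted).
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def core_transform(block):
    """Compute CF * block * CF^T for a 4x4 block."""
    return matmul(matmul(CF, block), transpose(CF))


def subtract(a, b):
    """Element-wise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Illustrative sample values (hypothetical, not from the paper).
current = [
    [52, 55, 61, 66],
    [63, 59, 55, 90],
    [62, 59, 68, 113],
    [63, 58, 71, 122],
]
prediction = [[50, 56, 60, 70] for _ in range(4)]

# Pixel-domain route: subtract first, then transform the residual.
residual_then_transform = core_transform(subtract(current, prediction))

# Transform-domain route: transform both blocks, then subtract.
transform_then_subtract = subtract(
    core_transform(current), core_transform(prediction)
)

print(residual_then_transform == transform_then_subtract)
```

Both routes give identical coefficients, so a transcoder that already holds the transformed current block only needs the transform of the prediction block to form the residual.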
A. Intra 4×4
The intra prediction of 4×4 blocks defined by H.264 is
illustrated in Fig. 1(a), where the symbols a to p denote the
pixels of the current block, and the symbols A to M denote the
neighboring pixels from which the prediction block is
calculated. The direction of prediction is shown in Fig. 1(b).
Due to the page limit, we choose the first four modes of the
4×4 intra prediction to illustrate our approach.
1) Vertical
The prediction samples in this mode (mode 0) are
obtained from the four neighboring pixels (A to D) above
the block to be coded. Sample A is copied to every pixel in
the first column of the block, sample B is copied to every
pixel in the second column of the block, and so on.
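As an illustration, the vertical prediction block can be written as the outer product of a column of ones and the row [A B C D]. Applying the 4×4 core transform to this product leaves only the first row of coefficients nonzero, because the rows of the transform matrix other than the first sum to zero. The Python sketch below checks this under the assumption that the forward core transform of H.264 is used without scaling. The function names and sample values are hypothetical, and the sketch is not necessarily the formulation derived in this paper.

```python
# Forward 4x4 core transform matrix of H.264 (post-scaling omitted).
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def core_transform(block):
    """Compute CF * block * CF^T for a 4x4 block given as a list of rows."""
    tmp = [
        [sum(CF[i][k] * block[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]
    return [
        [sum(tmp[i][k] * CF[j][k] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def vertical_prediction(above):
    """Pixel-domain mode 0: copy the samples A..D down every column."""
    return [list(above) for _ in range(4)]


def vertical_prediction_transform(above):
    """Transform-domain mode 0 without building the pixel block.

    CF applied to a column of ones gives [4, 0, 0, 0]^T, so only the
    first coefficient row survives: 4 * ([A B C D] * CF^T).
    """
    first_row = [4 * sum(above[k] * CF[j][k] for k in range(4)) for j in range(4)]
    return [first_row] + [[0] * 4 for _ in range(3)]


above = [10, 20, 30, 40]  # illustrative values for A, B, C, D

via_pixels = core_transform(vertical_prediction(above))
direct = vertical_prediction_transform(above)

print(direct[0])
print(via_pixels == direct)
```

The direct route evaluates one 1-D transform of the row [A B C D] instead of a full 2-D transform of the 4×4 prediction block.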
This work was supported in part by the National Science
Council of Taiwan under contract NSC93-2752-E-002-006-PAE.
Figure 1. Illustration of 4×4 intra prediction: (a) pixels a to p of the current block and neighboring pixels A to M; (b) directions of prediction.
0-7803-8834-8/05/$20.00 ©2005 IEEE.