Three-Dimensional Transforms and Entropy Coders for a Fast Embedded Color Video Codec

Vanessa Testoni and Max H. M. Costa
School of Electrical and Computer Engineering - State University of Campinas (UNICAMP)
{vtestoni,max}@decom.fee.unicamp.br

Abstract

This work compares the performance of two fast 3-D transforms and two adaptive Golomb entropy coders applied to a video codec system named FEVC (Fast Embedded Video Codec). The compared transforms are the Hadamard transform (4x4x4 and 8x8x8) and the H.264/AVC integer DCT (4x4x4). The compared adaptive Golomb entropy coders have different operation modes and adaptation strategies. New 3-D implementation methods for the transforms are presented. After the scan procedure, the entropy coders encode the 3-D coefficients bit-plane by bit-plane, producing a fully embedded output bitstream. The FEVC (also described here) was developed to be implemented in each of a large number of set-top boxes used in a fiber optics network. For that reason, it is focused on reduced complexity and execution time, not on high compression rates. It must also run on meager computational resources. Even with these constraints, good distortion versus rate results were achieved.

1. Introduction

The comparisons between the two fast 3-D transforms [1] [2] and between the two Golomb entropy coders [3] [4] presented here are studies to improve the performance of a color video codec named Fast Embedded Video Codec (FEVC). The new transform implementation method presented here is also applied to the codec. The FEVC was developed in the C# language to be executed in a set-top box device under development. A large number of these set-top boxes will be the interface between a fiber optics network and its users. This device will receive digital signals, extract audio, video and data information, and send the processed information to an output device.
Among other functions, such as Internet accessibility and voice over IP, the set-top box will be able to receive and transmit video signals coming from, for example, video on demand and video conference applications.

Research on video coding systems typically looks for techniques that can reach the highest possible compression rate while not exceeding a given level of distortion. This increase in compression rate is generally achieved by means of increased coding complexity, which is supported by the availability of increasing computational power. However, in some video coding applications, the use of high-capacity processors is not the most convenient choice. These situations require video codecs focused on reduced execution times and reduced computational complexity, and less concerned with high compression performance. This is the profile of the FEVC. Also, in some cases, the codecs are to be implemented in software only, as hardware implementations may not be permitted.

In order to reduce the codec execution times, the very simple Hadamard (8x8x8 and 4x4x4) transform and the H.264/AVC integer DCT-like (4x4x4) transform are used instead of the traditional DCT. These transforms were chosen because they are able to reduce the correlation between coefficients and their implementations require only additions and bit shifts. To further reduce execution times, motion estimation (ME) and motion compensation (MC) techniques are avoided. These are high-performance but time-consuming techniques. Instead, 3-D transforms are used to reduce correlation in both the spatial and temporal dimensions.

After transforming 3-D blocks of pixels, the codec reads and reorders each coefficient block. It was found that the probability distribution of the dominant AC coefficients is spread along the major axes of the 3-D cubes, just as found for 3-D DCT cubes [5] [6].
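As an illustration of why a 3-D Hadamard transform needs only additions and subtractions, the following is a minimal sketch of a separable transform applied to an N x N x N block (N a power of two). It is not the implementation method proposed in this paper. The function names `fwht` and `hadamard_3d` are hypothetical, Python is used only for brevity, and the output is left unnormalized and in natural (Hadamard) order rather than in sequency order.

```python
def fwht(vec):
    """Unnormalized fast Walsh-Hadamard transform of a list whose
    length is a power of two. Uses only additions and subtractions."""
    out = list(vec)
    n = len(out)
    h = 1
    while h < n:
        # Butterfly stage: combine elements that are h positions apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def hadamard_3d(cube):
    """Separable 3-D Hadamard transform of an N x N x N nested list,
    indexed as cube[t][y][x] (t: frame, y: row, x: column)."""
    n = len(cube)
    c = [[list(row) for row in plane] for plane in cube]  # work on a copy

    # 1-D transform along x (each row of each frame).
    for t in range(n):
        for y in range(n):
            c[t][y] = fwht(c[t][y])

    # 1-D transform along y (each column of each frame).
    for t in range(n):
        for x in range(n):
            col = fwht([c[t][y][x] for y in range(n)])
            for y in range(n):
                c[t][y][x] = col[y]

    # 1-D transform along t (the temporal direction).
    for y in range(n):
        for x in range(n):
            tube = fwht([c[t][y][x] for t in range(n)])
            for t in range(n):
                c[t][y][x] = tube[t]
    return c


if __name__ == "__main__":
    flat = [[[1] * 4 for _ in range(4)] for _ in range(4)]
    coeffs = hadamard_3d(flat)
    # A constant block puts all of its energy in the DC coefficient.
    print(coeffs[0][0][0])
```

Because the Hadamard matrix is symmetric and its square is N times the identity, applying `hadamard_3d` a second time returns the original block scaled by N^3. For a 4x4x4 block, the inverse therefore reduces to the same routine followed by a 6-bit right shift of every coefficient.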
It was also found that the cube energy is concentrated according to the coefficient sequency number, a concept related to the notion of frequency, in the three dimensions. To benefit from this energy distribution pattern, a scan order [7] based on the multiplication of the three sequency numbers of each coefficient is