Backward-Compatible Robust Error Protection of JPEG XR Compressed Video

Paolo Micanti, Giuseppe Baruffa, Fabrizio Frescura, Leonardo Angelini, Maurizio Caon
Department of Information and Electronic Engineering
University of Perugia
Perugia, Italy
{micanti, baruffa, frescura}@diei.unipg.it

Abstract- The new JPEG XR image coding standard offers a high compression ratio while maintaining good visual quality. However, it has low error robustness, which makes it unsuitable for transmission over error-prone channels, e.g., wireless channels. An extension to the standard has been developed that can correct transmission errors, both bit errors and packet losses, and that is fully compatible with legacy decoders. Data interleaving and channel coding offer good protection against transmission errors; different levels of protection can be adopted in order to trade off error protection capability against decompressed image quality.

Keywords- JPEG XR; error protection; wireless image transmission; data interleaving

I. INTRODUCTION

The significant limitations of the JPEG image standard motivated the development of a new standard that could achieve better performance. When a low-bitrate channel is used to transmit the compressed images, the JPEG codec is inadequate because of the evident artifacts it introduces in images with sharp contours and text, especially at high compression ratios. Two standards have been developed to replace the old JPEG: JPEG 2000, created by the Joint Photographic Experts Group committee, and JPEG XR, developed by Microsoft. JPEG XR (ISO/IEC 29199-2) became an International Standard in June 2009 and is also an ITU-T Recommendation (T.832) [1]. The Motion JPEG XR specification for support of video sequences is currently approaching the Final Committee Draft phase of the ISO/IEC approval process.
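As a rough illustration of why the data interleaving mentioned in the abstract helps against packet losses, the following Python sketch applies a generic block interleaver to a few equal-length codewords and compares how a burst of corrupted bytes is distributed with and without interleaving. This is not the scheme proposed in this paper: the function names (`interleave`, `deinterleave`, `burst`, `errors_per_codeword`), the codeword sizes, and the burst position are all hypothetical choices made only for the example.

```python
def interleave(data: bytes, depth: int) -> bytes:
    """Treat `data` as `depth` consecutive codewords and read them column by column."""
    if depth <= 0 or len(data) % depth:
        raise ValueError("length of data must be a positive multiple of depth")
    n = len(data) // depth  # bytes per codeword
    return bytes(data[i * n + j] for j in range(n) for i in range(depth))


def deinterleave(data: bytes, depth: int) -> bytes:
    """Inverse of interleave(): restore the original codeword order."""
    if depth <= 0 or len(data) % depth:
        raise ValueError("length of data must be a positive multiple of depth")
    n = len(data) // depth
    return bytes(data[j * depth + i] for i in range(depth) for j in range(n))


def burst(data: bytes, start: int, length: int) -> bytes:
    """Simulate a channel burst by inverting `length` consecutive bytes."""
    corrupted = bytearray(data)
    for k in range(start, start + length):
        corrupted[k] ^= 0xFF
    return bytes(corrupted)


def errors_per_codeword(sent: bytes, received: bytes, depth: int) -> list:
    """Count the corrupted bytes that fall inside each of the `depth` codewords."""
    n = len(sent) // depth
    return [
        sum(a != b for a, b in zip(sent[i * n:(i + 1) * n], received[i * n:(i + 1) * n]))
        for i in range(depth)
    ]


# Hypothetical setup: 4 codewords of 8 bytes each, hit by a 4-byte burst.
DEPTH = 4
payload = bytes(range(32))

# Without interleaving, the whole burst lands in a single codeword.
plain_rx = burst(payload, start=10, length=4)
plain_errors = errors_per_codeword(payload, plain_rx, DEPTH)  # [0, 4, 0, 0]

# With interleaving, the same burst is spread over all codewords.
inter_rx = deinterleave(burst(interleave(payload, DEPTH), start=10, length=4), DEPTH)
inter_errors = errors_per_codeword(payload, inter_rx, DEPTH)  # [1, 1, 1, 1]

print(plain_errors, inter_errors)
```

Under the assumption that each codeword is protected by a channel code able to correct a single byte error, the interleaved stream in this toy case would be fully recoverable, while the non-interleaved one would lose a whole codeword. In general, a burst no longer than the interleaving depth touches each codeword at most once.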
In this paper, an improvement to the JPEG XR codec is presented, with the aim of making it robust to errors caused by transmission over, or storage on, unreliable channels such as wireless ones. Especially in the case of video delivery, a return channel might not be available for requesting missing data again (as could be done for the transfer of still images). In this case, the original JPEG XR codec does not provide error correction, which leads to a deterioration of image quality and, possibly, to the loss of the whole frame. Interference and fading could lead to the loss of many frames, making the video very difficult to watch. Replacing lost frames with the last correctly received frame also results in poor quality. Therefore, redundancy must be introduced to repair partially damaged images. The redundancy is placed directly in the coded stream, changing the JPEG XR container structure in a way that is transparent to a decoder that does not implement the correction functions.

The JPEG XR codec was chosen because an efficient image compressor is needed to keep the transmission bandwidth consistent with the capabilities of current wireless technologies. JPEG XR capabilities, such as better handling of text and graphics, could be exploited in particular applications, e.g., remote computer desktop or projector handling. Moreover, the codec is already integrated in popular operating systems and in other software. A similar improvement has already been developed for the JPEG 2000 codec [2], [3], [4].

II. OVERVIEW OF JPEG XR

The JPEG XR image standard originated from the HD Photo project. In 2007, it was proposed to the Joint Photographic Experts Group as a standard to replace the old JPEG format. In 2009, JPEG XR became an ISO/IEC standard. The acronym XR stands for eXtended Range and denotes the codec's ability to manage a conversion space of 16 or 32 bits instead of the usual 8 bits.
Therefore, JPEG XR allows for High Dynamic Range (HDR) shooting and picture exposure control with little or no loss of information.

The main features of the JPEG XR algorithm are the Lapped Biorthogonal Transform (LBT) and an advanced coefficient coding scheme [5]. The LBT is composed of two operators, the Photo Core Transform (PCT) and the Photo Overlap Transform (POT). These operators apply to 4×4 pixel blocks, which are grouped into macroblocks of 4×4 blocks; each macroblock therefore contains 16 blocks. The PCT is the main transform, which leads to spatial decorrelation between blocks but does not avoid the artifacts at the edges of adjacent blocks that are typical of the JPEG codec at high compression levels. To avoid these artifacts, the POT operator must be used to overlap blocks. This overlap is optional and can be selected during encoder setup. The two operators are alternated in different steps to generate the LBT. Coefficients in frequency domain for each