LOSSLESS DEPTH MAP CODING USING BINARY TREE BASED DECOMPOSITION AND CONTEXT-BASED ARITHMETIC CODING

Shampa Shahriyar (1,4), Manzur Murshed (2,5), Mortuza Ali (3,6), Manoranjan Paul (2,5)

(1) Faculty of Information Technology, (2) Faculty of Science and Technology, (3) School of Computing and Mathematics
(4) Monash University, Australia, (5) Federation University Australia, (6) Charles Sturt University, Australia

ssha204@student.monash.edu, {manzur.murshed, mortuza.ali}@federation.edu.au, mpaul@csu.edu.au

ABSTRACT

Depth maps are becoming increasingly important in the context of emerging video coding and processing applications. Depth images represent the scene surface and are characterized by areas of smoothly varying grey levels separated by sharp edges at the positions of object boundaries. To enable high-quality view rendering at the receiver side, it is important to preserve these characteristics. Lossless coding avoids the rendering artifacts that depth compression artifacts would otherwise cause in synthesized views. In this paper, we propose a binary tree based lossless depth coding scheme that arranges the residual frame into an integer or binary residual bitmap. The high spatial correlation in the depth residual frame is exploited by creating large homogeneous blocks of adaptive size, which are then coded as a unit using context-based arithmetic coding. On standard 3D video sequences, the proposed lossless depth coding scheme achieves compression ratios in the range of 20 to 80.

Index Terms: Binary Tree Based Decomposition, Context Adaptive Arithmetic Coding, Lossless Depth Coding, Motion Vector Coding.

1. INTRODUCTION

Acquisition, storage, and transmission of depth maps, along with textures, are becoming increasingly important in emerging video coding and processing applications. For example, in 3D and multi-view video coding applications, depth maps are used for disparity compensation and for view synthesis using the depth image based rendering (DIBR) technique [1].
Depth maps can also improve the accuracy of video processing applications such as activity detection, object detection, object tracking, scene analysis, and augmented reality [2].

Standard image and video coding schemes, such as JPEG [3], JPEG-LS [4], H.264/AVC [5], and HEVC [6], are inadequate for depth map compression because the characteristics of depth maps differ significantly from those of natural images and video textures. Moreover, the use of depth maps in video coding and processing applications requires depth coding schemes to satisfy additional requirements. For example, depth images are homogeneous or smooth except in the boundary regions that represent the shapes of objects, so depth value discontinuities occur mostly at the edges. Consequently, edges in depth maps are more sensitive to coding errors, which degrade perceptual video quality by causing foreground and background pixels to be misinterpreted.

Existing lossy compression techniques are effective in coding large texture-less regions within depth maps; on the other hand, they tend to obscure the depth edges after transformation and coarse quantization. Lossy compression of a depth map therefore introduces geometrical distortion into the synthesized view. In some scenarios, highly efficient lossless compression of depth data can provide an attractive alternative to lossy coding, since depth coding then introduces no distortion in the synthesized views. For natural images, lossless coding typically provides a compression ratio of about 2 to 3, which makes lossless compression infeasible for many practical applications. For depth images, preliminary results show that much higher compression ratios can be achieved when specialized algorithms are applied: 15 to 30 in the case of intra coding and 20 to 50 in the case of stereo depth images [7]. Only a few works on efficient lossless compression of mono-view depth data have been proposed in the literature.
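
To put the compression ratios quoted above in perspective, they can be converted to an average bit rate per depth sample. The calculation below is only illustrative: it assumes 8-bit depth samples and takes the compression ratio to mean raw size divided by coded size. The symbols b (bit depth) and bpp (bits per pixel) are introduced here for convenience and are not notation from the cited works.

```latex
% Assumption: b = 8 bits per raw depth sample; CR = raw size / coded size.
\[
  \mathrm{bpp} = \frac{b}{\mathrm{CR}}, \qquad b = 8
\]
\[
\begin{aligned}
  \mathrm{CR} = 2 \text{ to } 3   &\;\Rightarrow\; \mathrm{bpp} \approx 4.00 \text{ to } 2.67 && \text{(natural images, lossless)} \\
  \mathrm{CR} = 15 \text{ to } 30 &\;\Rightarrow\; \mathrm{bpp} \approx 0.53 \text{ to } 0.27 && \text{(depth, intra coding [7])} \\
  \mathrm{CR} = 20 \text{ to } 50 &\;\Rightarrow\; \mathrm{bpp} = 0.40 \text{ to } 0.16 && \text{(stereo depth images [7])}
\end{aligned}
\]
```

Under these assumptions, specialized lossless depth coding spends well under one bit per sample, whereas lossless coding of natural images spends several.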
In [8], Kim et al. presented a bit-plane-based approach for depth video sequences that uses binary shape coding and motion estimation. The scheme decomposes the original frame into several bit planes after first transforming the depth values using Gray code. The decomposed bit planes are then encoded independently and in order, from the MSB bit plane to the LSB bit plane, so the encoding process is repeated eight times. Zamarin and Forchhammer [2] modified the technique of Kim et al. by changing the prediction templates (inter and intra) and by extending the prediction to the view level using stereo disparity images and disparity warping.

Since CABAC (context-adaptive binary arithmetic coding) [9] was originally designed for lossy texture coding, it