Evaluation of Depth-Based Super Resolution on Compressed Mixed Resolution 3D Video

Michal Joachimiak a, Payman Aflaki a, Miska M. Hannuksela b, Moncef Gabbouj a

a Tampere University of Technology, Tampere, Finland
b Nokia Research Center, Tampere, Finland

Abstract. The MVC+D standard specifies the coding of Multiview Video plus Depth (MVD) data to enable advanced 3D video applications. MVC+D requires that all views be coded with an H.264/MVC encoder at equal spatial resolution. Compression efficiency can be improved with mixed resolution coding, in which some of the texture views are coded at reduced spatial resolution. In this paper we evaluate the performance of Depth-Based Super Resolution (DBSR) on compressed mixed resolution MVD data. Experimental results show that the objective coding performance metric improves for sequences with accurate depth data. Although some sequences with poor depth quality show a slight decrease in objective coding performance, subjective evaluation shows that the perceived quality of the DBSR method is equal to that of the symmetric resolution case. We also show that the depth re-projection consistency check step of DBSR can be replaced with a simpler consistency check method. This reduces the computational complexity of DBSR by 26%, with a 0.2% dBR average bitrate reduction for coded views and a 0.1% average bitrate increase for synthesized views. The proposed scheme outperforms the anchor MVC+D coding scheme by 7.2% dBR on average for the total coded bitrate and by 10.9% dBR on average for synthesized views.

1 Introduction

3D video consumer devices, including video cameras and displays, are starting to emerge on the market. To store 3D video data efficiently, new compression methods are required.
In response to the growing need for 3D video compression, the Moving Picture Experts Group (MPEG) initiated a 3D video standardization process [5], which has been continued by the Joint Collaborative Team on 3D Video Coding (JCT-3V) since July 2012. 3D video consists of a set of 2D video sequences captured by time-synchronized cameras. Video acquisition using many cameras is challenging, and some displays, such as autostereoscopic displays (ASD), require many views as input. However, encoding and transmitting many views would require a great amount of processing power and bandwidth. With the help of depth-image-based rendering (DIBR) [16] techniques, it is possible to capture and encode a smaller number of views and synthesize the missing views from the decoded texture and corresponding depth