Optimal video coding for bit-rate switching applications: a game-theoretic approach

Stefania Colonnese, Gianpiero Panci, Stefano Rinauro, and Gaetano Scarano
Dip. INFOCOM, Università “La Sapienza” di Roma, via Eudossiana 18, 00184 Roma, Italy
e-mail: {colonnese,panci,rinauro,scarano}@infocom.uniroma1.it

Abstract

In this work¹ we discuss a game-theoretic approach to bitstream switching in video coding. Fast and bit-saving video bitstream switching is an important issue in video communication systems over time-varying channels. The most recent video coding standard, namely H.264, supports seamless switching among bitstreams coded at different bit-rates by means of suitably coded frames, named Switching Pictures. Since the rate-distortion characteristics of switching frames differ from those of I and P frames, their location affects both the bit-rate and the quality of the coded sequence. In this work, we address the optimization of the SP frame locations under an assigned bit-rate budget. To this aim we resort to a game-theoretic approach, and we show that the optimal solution is met when the SP frames are assigned to the frames with the smallest innovation. Experimental results show the advantage, in terms of both rate and distortion, achieved by the optimized switching frame insertion with respect to basic H.264 coding.

I. Introduction

Fast and bit-saving video bitstream switching is an important issue in video communication over channels that vary because of mobility and/or handover in heterogeneous networks. The most recent video coding standard, namely ITU-T Rec. H.264, introduces two new frame types, SP-frames and SI-frames, that can be perfectly reconstructed even when different reference frames are used for their prediction [1]. This property allows seamless bitstream switching at a lower cost than using I-frames to provide random access. Theoretical and empirical rate-distortion curves of SI and SP frames have been provided in [2].
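
The placement criterion stated in the abstract (SP frames assigned to the frames with the smallest innovation) can be illustrated with a minimal sketch. The text up to this point does not define how innovation is measured, so the sketch assumes a simple stand-in: the mean absolute pixel difference between a frame and its predecessor. The function names and the `num_sp` parameter are hypothetical and are not taken from the paper or from any H.264 encoder.

```python
def frame_innovation(prev, curr):
    """Assumed proxy for innovation: mean absolute pixel difference
    between two frames, each given as a list of rows of luma values."""
    total = 0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for p, c in zip(row_prev, row_curr):
            total += abs(c - p)
            count += 1
    return total / count


def choose_sp_frames(frames, num_sp=1):
    """Return the sorted indices of the num_sp frames with the smallest
    innovation with respect to their predecessor (hypothetical helper)."""
    # Frame 0 has no predecessor, so candidates start at index 1.
    scores = [
        (frame_innovation(frames[k - 1], frames[k]), k)
        for k in range(1, len(frames))
    ]
    scores.sort()  # ascending innovation; ties go to the earlier frame
    return sorted(k for _, k in scores[:num_sp])


# Toy sequence of four 1x2 frames: frame 2 barely changes from frame 1.
frames = [[[0, 0]], [[10, 10]], [[10, 11]], [[50, 50]]]
print(choose_sp_frames(frames, num_sp=1))  # [2]
```

In this toy sequence the second transition carries the least new information, so frame 2 is the candidate location. A real encoder would more plausibly measure innovation on the motion-compensated prediction residual, which this sketch does not model.
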
Since the rate-distortion curves of SI and SP frames differ from those of usual I and P frames, it is worth investigating how the switching frames can be introduced under suitable optimality criteria. Although rate-distortion optimization is widely discussed in the video coding literature [3], [4], it is still of great interest, and novel results have recently been obtained by resorting to a game-theory-based technique for optimizing bit-rate control in video coding [5]. In [5], the authors optimize the perceptual quality of the decoded sequence while guaranteeing “fairness” in bit allocation among macroblocks. Since viewers perceive the whole frame as one entity, macroblocks represent players that compete cooperatively under the global objective of achieving the best quality within the given bit constraint.

Following the optimization approach of [5], here we formulate the problem of video coding for bitstream switching by representing the frames of a sequence as players and the overall sequence quality as the objective function. The strategy of each player is the choice of the coding mode and of the allocated bits. Experimental results are provided to show the performance of the optimized coding strategy in a video streaming environment allowing bitstream switching.

¹ This work is partially supported by the Italian National project Wireless 8O2.16 Multi-antenna mEsh Networks (WOMEN) under grant number 2005093248.

II. Bitstream switching using SP frames

The H.264 video coding standard introduces two special syntactic structures, named Switching Intra (SI) and Switching Predicted (SP) frames, that allow drift-free switching between different coded bitstreams. In fact, the switching pictures provide access points to the coded bitstream, so that during the communication the bit-rate can be dynamically changed to adapt to the network conditions by resorting to pre-coded bitstreams. Each switching frame has a primary and a secondary representation.
The primary representation is sent along each bitstream, and it provides the virtual access point to that bitstream for users coming from another bitstream. The secondary representation is sent only during the switching phase, and it allows decoding exactly the same frame as the primary representation while using different reference frames. In other words, the primary and secondary coded frames are two alternative representations of the same frame, differing only in the prediction step.

An example of SP frame application is shown in Fig. 1, representing two bitstreams and the corresponding decoded frames. Frames 1 to 3 are decoded from the first coded bitstream, while frames 5 to 7 are decoded from the second bitstream. Frame 4 is decoded using its secondary representation, transmitted only when bitstream switching is performed, and prediction from the first bitstream. Thanks to the properties of Switching Pictures, this frame 4 is identical to the frame 4 decoded using its primary representation and prediction from the second bitstream. The adoption of SP frames is particularly useful in