ROYALTY COST BASED OPTIMIZATION FOR VIDEO COMPRESSION

Emrah Akyol, Onur G. Guleryuz, and M. Reha Civanlar
{eakyol, guleryuz, rcivanlar}@docomolabs-usa.com
DoCoMo USA Labs, 3240 Hillview Avenue, Palo Alto, CA 94304, USA

ABSTRACT

A video compression standard incorporates many tools and technologies that must be licensed by systems that deploy the standard. The licensing determines the royalty costs that must be paid to the holders of intellectual property on the respective tools. With the current abundance of well-understood and effective video compression tools, one can imagine the formation of cross-cutting tool libraries with tools drawn from different video compression standards. Such libraries allow dynamic selection from a large pool of tools, with potentially overlapping functionality, when encoding individual video sequences. In this paper we examine the royalty cost aspect of the scenario where video is encoded using a library of royalty-bearing tools, by considering encoding that jointly optimizes rate, distortion, and royalty cost. We provide a system that optimizes video delivery under various licensing conditions imposed on tool intellectual property. We present an example of royalty based encoding (using assumed royalty costs) to show the merit of the proposed framework.

1. INTRODUCTION

In recent video coding standards (such as the H.264/AVC standard [1]), significant improvements in rate-distortion performance have been made possible by the incorporation of a substantial number of new tools, such as variable block size motion estimation, intra prediction, quarter-pixel motion compensation, multi-frame and multi-hypothesis motion estimation (ME), an adaptive de-blocking filter, and context based arithmetic coding, to name a few.
Using just the toolsets collected from currently deployed standards, one can today transport video through a variety of networks using a wide range of tools that correspond to a wide range of efficiencies in end-to-end delivery. The MPEG-RVC (Reconfigurable Video Coding) standard [1], currently under development, allows the construction of tool libraries and provides a description language with which an encoder can signal to a decoder which subset of the tools within a library is used on a particular video sequence. When combined with the proposed work, such cross-cutting libraries can facilitate many applications that do not readily fit into the target application domain of a single standard or profile.

If one is concerned with the highest efficiency media delivery, one is often confined to more recent, state-of-the-art tools that tend to have high royalty costs. If, on the other hand, one allows some inefficiency, it may be possible to accomplish delivery with reduced royalty costs or even free of any royalty costs. Given that the encoded media itself may also have content licensing costs, the system we consider finds the optimal trade-off by encoding media in a way that minimizes the combined royalty cost for the desired media quality level and the effective bandwidth of the transport medium. In essence, the encoding we propose optimizes performance over a function that describes optimal encoding points for the allowed range of triplet values formed by distortion, rate, and royalty cost.

[Figure 1: The general scheme of the proposed framework. Labeled elements: video segment, flexible encoder, bitstream, tool set selection, encoding parameters, registry, media server.]

In this paper we provide a system, composed of a server, a registry, and users, which optimizes video delivery under various licensing conditions imposed on tool intellectual property.
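
The trade-off among distortion, rate, and royalty cost described above can be illustrated with a small sketch. It is not the method of this paper: it assumes a weighted-sum cost J = D + lam*R + mu*C, an exhaustive search over tool subsets, and a toy stand-in for the encoder. The names `ROYALTY`, `toy_encode`, and `best_toolset`, as well as every number, are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical per-tool royalty costs (illustrative values, not real fees).
ROYALTY = {"quarter_pel_mc": 3.0, "cabac": 2.0, "deblocking": 1.0}

# Hypothetical rate savings each tool brings at a fixed distortion.
RATE_SAVING = {"quarter_pel_mc": 20.0, "cabac": 12.0, "deblocking": 5.0}


def all_subsets(tools):
    """Yield every subset of `tools` as a frozenset, smallest first."""
    names = sorted(tools)
    for k in range(len(names) + 1):
        for combo in combinations(names, k):
            yield frozenset(combo)


def toy_encode(subset):
    """Stand-in for a real encoder: return (rate, distortion) for a tool subset.

    Assumption: distortion is held fixed and each enabled tool lowers the rate
    by an independent amount. A real encoder would have to be run instead.
    """
    rate = 100.0 - sum(RATE_SAVING[t] for t in subset)
    distortion = 10.0
    return rate, distortion


def best_toolset(tools, encode, lam, mu):
    """Minimise J = D + lam * R + mu * C by exhaustive search over subsets."""
    best_cost, best_subset = None, None
    for subset in all_subsets(tools):
        rate, distortion = encode(subset)
        royalty = sum(ROYALTY[t] for t in subset)
        cost = distortion + lam * rate + mu * royalty
        if best_cost is None or cost < best_cost:
            best_cost, best_subset = cost, subset
    return best_subset, best_cost


if __name__ == "__main__":
    for mu in (0.0, 5.5, 100.0):
        subset, cost = best_toolset(ROYALTY, toy_encode, lam=1.0, mu=mu)
        print(f"mu={mu}: tools={sorted(subset)} J={cost}")
```

In this toy setting, a royalty weight `mu` of zero selects every tool, a very large `mu` selects none, and intermediate values keep only the tools whose rate savings outweigh their weighted royalty. Raising `lam` relative to `mu`, as one might when bandwidth is scarce, shifts the choice toward the more efficient but costlier tools. Exhaustive search is exponential in the number of tools and is used here only for clarity.
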
The optimization is such that, at the same decoded video quality, video encoded for high bandwidth environments can be decoded using lower efficiency but reduced royalty cost tools, whereas video encoded for low bandwidth environments can be decoded using higher efficiency but increased royalty cost tools.

A general scheme for the proposed royalty based encoding is shown in Figure 1. In this framework, each video segment is encoded with a possibly different set of tools, chosen in a content adaptive fashion according to the total rate, distortion, and royalty cost constraints. We assume a flexible encoder can use any subset of the toolset, which is known at both the encoder and the decoder. We assume that each tool in the toolset has a royalty cost that determines the cost of licensing that tool for use in coding