ITERATIVELY DECODABLE CODES FOR WATERMARKING APPLICATIONS

Mustafa Kesal, M. Kıvanç Mıhçak, Ralf Koetter and Pierre Moulin

U. of Illinois, Coordinated Science Research Lab. and ECE Dept.
1308 W. Main St., Urbana, IL 61801
Email: {kesal, koetter}@comm.csl.uiuc.edu, {mihcak, moulin}@ifp.uiuc.edu

ABSTRACT

The problem of information hiding, or watermarking, is investigated. Based on an information-theoretic analysis of the watermarking task, we investigate a strategy that employs binary codes to robustly hide data in a given host signal. The central theme of the problem is the interplay between a vector quantization task and a channel coding task. Turbo codes and other coding schemes are compared in terms of their performance in a watermarking application.

Keywords: Watermarking, vector quantization, turbo codes, iterative decoding, product codes

1. INTRODUCTION

Data hiding refers to the nearly invisible embedding of information within a host data set such as text, audio, image, or video. Applications include watermarking, steganography, image databases, and in-band captioning. Most watermarking research to date has focused on novel ways to hide information and to detect and/or remove hidden information. However, a rigorous theory describing the fundamental limits of any watermarking system is just emerging. The watermarking problem can be viewed as a modified joint source-channel coding problem with given distortion and rate constraints.

The main role of the channel code is to embed the watermark signal into the original data with small distortion and to protect the original and the watermark signal against the attacker's distorting signal. In this paper, we first introduce the general watermarking problem along with the optimum strategies to be followed by the intended sender and the attacker. A simple quantization approach to this problem, combined with channel coding, is then analyzed.
In particular, we focus on iterative decoding techniques and the applicability of iteratively decodable codes to the watermarking problem. Performance plots for the various codes and directions for future work are discussed in the final section.

2. WATERMARKING

A theory has recently been developed to establish the fundamental limits of the fairly general data hiding problem described below [1, 2]; see Fig. 1. A message $M$ is to be communicated to a receiver. The message is embedded into a length-$N$ sequence $\tilde{X}^N = (\tilde{X}_1, \ldots, \tilde{X}_N)$, termed the host data set, typically data from a host image, video, or audio signal. The embedding is done using a cryptographic key $\tilde{K}^N = (\tilde{K}_1, \ldots, \tilde{K}_N)$ that is also available at the decoder. The resulting watermarked data, or composite data, $X^N = (X_1, \ldots, X_N)$ is subject to attacks that attempt to remove any trace of $M$ from $X^N$. The data-hiding process should be transparent: $X^N$ should be similar to $\tilde{X}^N$ according to a suitable distortion measure. The system should also be robust: the hidden message should survive any attack (within a reasonable class of attacks). A typical restriction on the attacker is a limit on the amount of distortion that he or she is willing to introduce.

Figure 1: A general setup for a watermarking scheme. (Block diagram with labels: $\tilde{X}^N$, $K^N$, $M$, Encoder, $X^N$, $Q(y|x)$, $Y^N$, Decoder, $\hat{M}$.)

The system can be analyzed by defining a statistical model for $M$, $\tilde{X}^N$ and $K^N$; choosing a distortion function; specifying constraints on the admissible distortion levels $D_1$ and $D_2$ for the data hider and the attacker, respectively; and specifying the information available to all parties. One can then seek the maximum rate of reliable transmission for $M$ over any possible data-hiding strategy and any attack that satisfy the specified constraints.
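The setup above can be made concrete with a small simulation. The following Python sketch is only an illustrative toy and is not the scheme analyzed in this paper: it assumes an uncoded, one-bit-per-sample dithered scalar quantizer as the encoder (a simple form of quantization index modulation), uses the key as the dither, models the attack $Q(y|x)$ as additive Gaussian noise, and measures distortion as mean squared error. The function names (`embed`, `attack`, `decode`, `mse`) and the parameters `delta` (quantizer step) and `d2` (attack noise variance) are hypothetical choices made for illustration.

```python
import math
import random


def quantize(v, delta):
    """Round v to the nearest multiple of delta."""
    return math.floor(v / delta + 0.5) * delta


def embed(host, key, bits, delta):
    """Encoder (assumed toy scheme): hide one bit per host sample.

    The dither is key[i] for bit 0 and key[i] + delta/2 for bit 1, so the
    two sets of reconstruction points are interleaved.
    """
    marked = []
    for x, k, b in zip(host, key, bits):
        d = k + b * delta / 2.0
        marked.append(quantize(x - d, delta) + d)
    return marked


def attack(marked, d2, rng):
    """Assumed memoryless attack: additive Gaussian noise of variance d2."""
    sigma = math.sqrt(d2)
    return [x + rng.gauss(0.0, sigma) for x in marked]


def decode(received, key, delta):
    """Blind decoder: uses the key but never sees the host signal."""
    bits = []
    for y, k in zip(received, key):
        errs = []
        for b in (0, 1):
            d = k + b * delta / 2.0
            # Distance from y to the nearest reconstruction point for bit b.
            errs.append(abs(y - (quantize(y - d, delta) + d)))
        bits.append(0 if errs[0] <= errs[1] else 1)
    return bits


def mse(a, b):
    """Mean squared error, used here as the distortion measure."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)


if __name__ == "__main__":
    rng = random.Random(0)
    n, delta, d2 = 1000, 1.0, 0.01
    host = [rng.gauss(0.0, 10.0) for _ in range(n)]    # host data
    key = [rng.uniform(0.0, delta) for _ in range(n)]  # shared key (dither)
    bits = [rng.randint(0, 1) for _ in range(n)]       # message, uncoded
    marked = embed(host, key, bits, delta)
    received = attack(marked, d2, rng)
    decoded = decode(received, key, delta)
    errors = sum(b != bh for b, bh in zip(bits, decoded))
    print("embedding distortion:", mse(host, marked))
    print("attack distortion:", mse(marked, received))
    print("bit errors:", errors)
```

In this toy, each sample moves by at most `delta / 2` during embedding, so the embedding distortion cannot exceed `delta ** 2 / 4`, and the empirical attack distortion should be close to `d2`. Because the message bits are sent uncoded, a sufficiently strong attack produces bit errors, which is the gap a channel code is meant to close.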
This analysis rests on information-theoretic principles, and in particular on the following fundamental concept, which so far appears to have been overlooked in the watermarking literature: even if the host signal $\tilde{X}^N$ is not available at the decoder (blind watermarking), the fact that the encoder