AN AUDIO WATERMARKING SCHEME BASED ON AN EMBEDDING STRATEGY WITH MAXIMIZED ROBUSTNESS TO PERTURBATIONS

C. Baras, N. Moreau *
GET - Télécom Paris, TSI Department
46 rue Barrault, 75013 Paris, FRANCE

P. Dymarski
Technical University of Warsaw
15/19 Nowowiejska, 00-665 Warsaw, POLAND

ABSTRACT

A particular application of audio watermarking systems consists in using the audio signal as a transmission channel for binary information. In this context, reliability criteria with respect to transmission rate, such as the Bit Error Rate (BER), are a major issue and define system performance. Current research has already shown the advantage of informed embedding over blind embedding. In this paper, we present a new informed watermarking system based on an embedding strategy that maximizes robustness to perturbations such as compression and analog transmission. The performance of the proposed system is then compared to that of the equivalent blind embedding system. Experimental results make it possible to assess the achieved efficiency: for transmission rates up to 500 bits/s, BERs are divided by 10 when the channel perturbation is compression and by 4 in the case of analog transmission.

1. INTRODUCTION

Digital representation of audio signals has made data access easier and has fostered illegal data exchange. Copyright protection has therefore become a major issue. For several years, watermarking techniques have been developed as a possible solution to this problem. Scientific studies have also pointed out new watermarking application fields (broadcast monitoring or transactional watermarking, for example). Our main concern is the particular case where the audio signal is viewed as a transmission channel in which binary information can be embedded. The watermarking process is then designed as a communication system with additive noise: the useful information (the watermark) is hidden by noise (the "host" audio signal).
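The communication view described above can be illustrated with a minimal sketch. It is not the system studied in this paper: the white Gaussian host, the pseudo-random carrier, the correlation detector, and the names `simulate_additive_channel` and `bit_error_rate` are all assumptions made for illustration only.

```python
import numpy as np

def bit_error_rate(sent, received):
    """Fraction of positions where the decoded bit differs from the sent bit."""
    sent = np.asarray(sent)
    received = np.asarray(received)
    return float(np.mean(sent != received))

def simulate_additive_channel(bits, host_power=100.0, mark_power=1.0, n=512, seed=0):
    """Embed one bit per block of n samples and decode it by correlation.

    Assumption: the host is modeled as white Gaussian noise, which real
    audio is not; this only illustrates the additive-noise picture.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical unit-power carrier shared by embedder and detector.
    carrier = rng.choice([-1.0, 1.0], size=n)
    decoded = []
    for b in bits:
        # The host audio plays the role of the channel noise.
        host = rng.normal(0.0, np.sqrt(host_power), size=n)
        # The watermark carries the useful information (bit 0 -> -1, bit 1 -> +1).
        mark = np.sqrt(mark_power) * (2 * b - 1) * carrier
        received = host + mark
        # Decide on the sign of the correlation with the known carrier.
        decoded.append(int(np.dot(received, carrier) > 0))
    return np.array(decoded)

bits = np.random.default_rng(1).integers(0, 2, size=200)
decoded = simulate_additive_channel(bits)
print("BER:", bit_error_rate(bits, decoded))
```

In this sketch the embedder ignores the host signal entirely, so raising `host_power` relative to `mark_power` directly raises the measured BER.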
In this context, the watermark has to be robust to classical distortions (which will be referred to as channel perturbations) such as compression or digital-to-analog and analog-to-digital conversion. The embedding strategy aims at transmitting as much information as possible with the best reliability. Thus, the transmission rate (in bits per second) and the Bit Error Rate (BER) define system performance.

* Thanks to ARTUS RNRT project for funding (http://www.telecom.gouv.fr/rnrt/projets/res 01 37.htm).

When choosing the embedding process, it is necessary to reconcile the perceptual distortion and watermark detection constraints. These constraints can be represented by two regions of the audio signal space [1]. The embedder's role consists in choosing a watermark that lies in the intersection of these two regions. The way the watermark is chosen depends on the required embedding strategy. The most promising strategy, pointed out by Cox in [2], exploits the similarity between an additive watermarking system and a communication system with side information. Using the a priori knowledge of the host signal to define an embedding strategy improves data hiding capacity. Indeed, Costa [3] proved that the capacity of such a communication system depends only on the channel perturbation power and the watermark power, not on the host signal. Moreover, he proposed an embedding scheme based on a structured codebook. Nevertheless, the implementation of Costa's embedder is limited by the codebook size. Current research aims at approaching this model by proposing structured codebooks, designed with quantization processes or fast codebook search methods. Most work uses theoretical channel capacity as a design criterion, as suggested by Costa, without taking into account that the channel perturbation power may not be known during the embedding process.
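For reference, Costa's result for the Gaussian case is commonly stated as below. The symbols P (watermark power), N (channel perturbation power) and Q (host signal power) are notation assumed here for illustration; they are not taken from this paper.

```latex
% Capacity, in bits per sample, of the Gaussian channel with side
% information at the encoder [3].
% Assumed notation: P = watermark power, N = channel perturbation power.
C = \frac{1}{2} \log_2\!\left(1 + \frac{P}{N}\right)
% The host power Q does not appear: C equals the capacity of the same
% channel without any host interference.
```

Evaluating this expression at the embedder requires N, which is precisely the quantity that may be unknown at embedding time.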
We propose a watermarking system that uses side information and maximizes robustness to channel perturbation without making any hypothesis on the perturbation power. The system is designed as a closed-loop scheme that introduces a local copy of the detection process at the embedder in order to take the knowledge of the audio signal into account. Its performance will be compared to that of the equivalent blind embedding scheme.

The outline of the paper is as follows. In section 2, audio watermarking principles and the reference blind embedding system (BSE) are described. In section 3, the closed-loop watermarking scheme and the embedding strategy maximizing robustness to perturbation, which will be referred to as ESMR, are presented. Experimental results are given in section 4, which allows us to analyze the impact of our embedding strategy on system performance.

2. AUDIO WATERMARKING SYSTEM PRINCIPLES

A reference audio watermarking system processing digital signals was developed in [4]. It was designed as a communication system, as shown in figure 1.

The source encoding process maps the hidden message into a sequence of $L$ symbols $\{k_l\}_{l=1..L}$. Each symbol is chosen from the set $\{1, ..., M\}$ and codes for $N_{bs} = \log_2(M)$ binary digits. The embedding process requires an embedding codebook $S$ containing $M$ waveforms of length $N$: $S = \{s_m\}_{m=1..M}$. The modulation interface maps each symbol $k_l$ to the $k_l$-th codebook waveform, so that the modulated signal on each symbol interval $[(l-1)N ... lN-1]$ is $v = s_{k_l}$. To satisfy the inaudibility constraint, the power spectral density of the embedded signal should be lower than a masking threshold given by a PsychoAcoustic Model (PAM). This threshold is to be taken into account when designing the filter $H(f)$ with impulse response $h(n)$. Fil-
IV - 357    0-7803-8484-9/04/$20.00 ©2004 IEEE    ICASSP 2004