AN AUDIO WATERMARKING SCHEME BASED ON AN EMBEDDING STRATEGY WITH
MAXIMIZED ROBUSTNESS TO PERTURBATIONS
C. Baras, N. Moreau
*
GET - Télécom Paris
TSI Department
46 rue Barrault, 75013 Paris, FRANCE
P. Dymarski
Technical University of Warsaw
15/19 Nowowiejska, 00-665 Warsaw, POLAND
ABSTRACT
A particular application of audio watermarking systems consists
in using the audio signal as a transmission channel for binary
information. In this context, reliability at a given transmission
rate, measured by criteria such as the Bit Error Rate (BER), is a
major issue that defines system performance. Current research has
already proven the advantage of using informed embedding rather
than blind embedding. In this paper, we present a new informed
watermarking system based on an embedding strategy that maximizes
robustness to perturbations such as compression and analog
transmission. The performance of the proposed system is then
compared to that of the equivalent blind embedding system.
Experimental results make it possible to assess the achieved
efficiency: for transmission rates up to 500 bits/s, BERs are
divided by 10 when the channel perturbation is compression and
by 4 in the case of analog transmission.
1. INTRODUCTION
Digital representation of audio signals has made data access easier
and fostered illegal data exchange. Copyright protection has
therefore become a major issue. For several years, watermarking
techniques have been developed as a possible solution to this
problem. Scientific studies have also pointed out new watermarking
application fields (broadcast monitoring or transactional
watermarking, for example). Our main concern is the particular case
where the audio signal can be viewed as a transmission channel in
which binary information is embedded. The watermarking process is
then designed as a communication system with additive noise: the
useful information (the watermark) is hidden by noise (the "host"
audio signal). In this context, the watermark has to be robust to
classical distortions (which will be referred to as channel
perturbations) such as compression or digital-to-analog and
analog-to-digital conversion. The embedding strategy aims at
transmitting as much information as possible with the best
reliability. Thus, transmission rate (in bits per second) and Bit
Error Rate (BER) define system performance.
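
The additive-noise view described above can be made concrete with a toy simulation. The sketch below is not the authors' system: it assumes a white Gaussian host, a pseudo-random +/-1 carrier, one bit per block of `n` samples, a correlation detector, and no channel perturbation. All names and parameter values are hypothetical.

```python
import random


def simulate_ber(num_bits=200, n=256, host_std=1.0, wm_gain=0.3, seed=0):
    """Estimate the BER of a toy blind additive watermark.

    Assumptions (not from the paper): the host audio is modelled as white
    Gaussian noise and the channel adds no further perturbation.
    """
    rng = random.Random(seed)
    # Pseudo-random carrier with +/-1 chips, shared by embedder and detector.
    carrier = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    errors = 0
    for _ in range(num_bits):
        bit = rng.randrange(2)
        sign = 1.0 if bit else -1.0
        # Additive embedding: received = host + watermark.
        received = [rng.gauss(0.0, host_std) + sign * wm_gain * c
                    for c in carrier]
        # Correlation detector: the sign of the correlation gives the bit.
        corr = sum(r * c for r, c in zip(received, carrier))
        decoded = 1 if corr > 0 else 0
        errors += int(decoded != bit)
    return errors / num_bits


if __name__ == "__main__":
    for gain in (0.3, 0.05, 0.02):
        print(f"wm_gain={gain}: estimated BER = {simulate_ber(wm_gain=gain):.3f}")
```

Lowering `wm_gain` relative to `host_std` raises the estimated BER, and shortening `n` raises the transmission rate at the cost of reliability. This is the rate/BER trade-off that defines system performance in the text.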
* Thanks to ARTUS RNRT project for funding
(http://www.telecom.gouv.fr/rnrt/projets/res 01 37.htm).
When choosing the embedding process, it is necessary to reconcile
the perceptual distortion and watermark detection constraints.
These constraints can be represented by two regions of the audio
signal space [1]. The embedder's role consists in choosing a
watermark which lies in the intersection of these two regions. The
way the watermark is chosen depends on the required embedding
strategy. The most promising strategy, pointed out by Cox in [2],
exploits the similarity between an additive watermarking system
and a communication system with side information. Using the
a priori knowledge of the host signal to define an embedding
strategy improves data hiding capacity. Indeed, Costa [3] proved
that the capacity of such a communication system depends only on
the channel perturbation power and the watermark power, not on
the host signal. Moreover, he proposed an embedding scheme based
on a structured codebook. Nevertheless, the implementation of
Costa's embedder is limited by the codebook size. Current research
aims at approaching this model by proposing structured codebooks,
designed with quantization processes or fast codebook search
methods. Most works use theoretical channel capacity as a design
criterion, as suggested by Costa, without taking into account that
the channel perturbation power may not be known during the
embedding process. We propose a watermarking system that uses side
information and maximizes robustness to channel perturbation
without making any hypothesis on the perturbation power. The
system is designed as a closed-loop scheme that introduces a local
copy of the detection process at the embedder to take into account
the knowledge of the audio signal. Its performance will be
compared to that of the equivalent blind embedding scheme.
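
Costa's result mentioned above can be stated compactly. The expression below is the standard form for the Gaussian setting, which is an assumption of this note (i.i.d. Gaussian host and Gaussian channel perturbation). The symbols $\sigma_W^2$ for the watermark power and $\sigma_Z^2$ for the perturbation power are notation chosen here, not taken from the paper. The capacity, in bits per sample, is:

```latex
C = \frac{1}{2} \log_2\!\left(1 + \frac{\sigma_W^2}{\sigma_Z^2}\right)
```

The host signal power does not appear in this expression. However, evaluating $C$ at embedding time requires $\sigma_Z^2$, which is the quantity that may be unknown to the embedder.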
The paper is organized as follows. In section 2, audio
watermarking principles and the reference blind embedding system
(BSE) are described. In section 3, the closed-loop watermarking
scheme and the embedding strategy maximizing robustness to
perturbation, which will be referred to as ESMR, are presented.
Experimental results are given in section 4, which allows us to
analyze the impact of our embedding strategy on system performance.
2. AUDIO WATERMARKING SYSTEM PRINCIPLES
A reference audio watermarking system processing digital signals
was developed in [4]. It was designed as a communication system,
as shown in figure 1.
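
As a rough illustration of the transmit side of such a system, the sketch below groups message bits into symbols, maps each symbol to a codebook waveform, and concatenates the waveforms, using the notation introduced in the rest of this section (M waveforms of length N). The random +/-1 codebook, the function names, and the 44.1 kHz sampling rate are assumptions made for illustration only; they are not taken from the reference system [4].

```python
import math
import random


def bits_to_symbols(bits, m):
    """Group bits into symbols in {1, ..., M}; each symbol codes log2(M) bits."""
    n_bs = m.bit_length() - 1
    if m < 2 or (1 << n_bs) != m:
        raise ValueError("M must be a power of two, at least 2")
    if len(bits) % n_bs:
        raise ValueError("number of bits must be a multiple of log2(M)")
    symbols = []
    for i in range(0, len(bits), n_bs):
        value = 0
        for b in bits[i:i + n_bs]:
            value = (value << 1) | b
        symbols.append(value + 1)  # symbols are numbered from 1, as in the text
    return symbols


def make_codebook(m, n, seed=0):
    """Hypothetical codebook S: M pseudo-random +/-1 waveforms of length N."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(n)] for _ in range(m)]


def modulate(symbols, codebook):
    """Concatenate v = s_{k_l} over the successive symbol intervals."""
    signal = []
    for k in symbols:
        signal.extend(codebook[k - 1])  # Python list indices start at 0
    return signal


def bit_rate(fs, m, n):
    """Rate in bits/s: log2(M) bits every N samples at sampling rate fs."""
    return fs * math.log2(m) / n


if __name__ == "__main__":
    message = [0, 0, 0, 1, 1, 0, 1, 1]
    symbols = bits_to_symbols(message, m=4)
    v = modulate(symbols, make_codebook(m=4, n=441))
    print(symbols, len(v), bit_rate(fs=44100, m=4, n=441))
```

With M = 4 and N = 441 at the assumed 44.1 kHz sampling rate, each symbol carries 2 bits and the rate is 200 bits/s. Increasing M or decreasing N raises the rate.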
The source encoding process maps the hidden message into a
sequence of $L$ symbols $\{k_l\}_{l=1..L}$. Each symbol is chosen
from the set $\{1, \ldots, M\}$ and codes for $N_{bs} = \log_2(M)$
binary digits. The embedding process requires an embedding
codebook $S$ containing $M$ waveforms of length $N$:
$S = \{s_m\}_{m=1..M}$. The modulation interface maps each symbol
$k_l$ to the $k_l$-th codebook waveform, so that the modulated
signal on each symbol interval $[(l-1)N \ldots lN-1]$ is
$v = s_{k_l}$. To satisfy the inaudibility constraint, the power
spectral density of the embedded signal should be lower than a
masking threshold given by a PsychoAcoustic Model (PAM). This
threshold is to be taken into account when designing the filter
$H(f)$ with impulse response $h(n)$. Fil-
IV - 357 0-7803-8484-9/04/$20.00 ©2004 IEEE ICASSP 2004