Security of spread-spectrum-based data hiding

Luis Pérez-Freire (a), Pierre Moulin (b) and Fernando Pérez-González (a)

(a) Signal Theory and Communications Dept., University of Vigo, 36310 Vigo, Spain
{lpfreire, fperez}@gts.tsc.uvigo.es
(b) Coord. Sci. Lab. & ECE Dept., University of Illinois at Urbana-Champaign, 405 N. Mathews Ave., Urbana, IL 61801, USA
moulin@ifp.uiuc.edu

ABSTRACT

This paper presents an information-theoretic analysis of security for data hiding methods based on spread spectrum. Security is quantified by means of the mutual information between the observed watermarked signals and the secret carrier (a.k.a. spreading vector) that conveys the watermark, a measure that can be used to bound the number of observations needed to estimate the carrier up to a certain accuracy. The main results of this paper establish fundamental security limits for this class of methods and allow conclusions to be drawn about the tradeoffs between robustness and security. Specifically, the impact of the dimensionality of the embedding function, the host rejection, and the embedding distortion on the security level is investigated, and in some cases explicitly quantified.

1. INTRODUCTION

This paper considers the security of spread spectrum methods for data hiding from an information-theoretic perspective. The fundamentals of this approach to security assessment can be found in [1], where the security of watermarking and data hiding methods is measured by quantifying the information about the secret key (or, more precisely, about the embedding parameters that depend on the secret key) that leaks from the observation of watermarked signals, together with the corresponding remaining uncertainty. The key assumptions are that the watermarker owns a secret key that he or she uses to watermark contents, and that an attacker is able to gather several signals (observations) that were watermarked with the same secret key.
An additional assumption is that the parameters of the watermarking scheme are known, according to Kerckhoffs' principle; hence, the attacker is only interested in disclosing the secret key. The information about the key provided by the observations is quantified by means of Shannon's mutual information, and the remaining uncertainty, or equivocation, about the key is measured by the differential entropy of the key conditioned on the observations, which can be straightforwardly related to the lowest attainable error in the estimation of the secret key. The number of observations needed to achieve a certain estimation accuracy can be regarded as the security level of the watermarking scheme.

This approach has been used in [2] for analyzing the security of lattice DC-DM methods, and it is used here to develop a complete security analysis of spread spectrum methods, i.e., those methods that perform watermark embedding in a secret subspace by modulating a secret carrier with the symbols to be embedded. Thus, the secret carrier plays the role of the secret parameter to be estimated. Specifically, three data hiding methods are considered: additive Spread Spectrum [3], attenuated Spread Spectrum [4], and Improved Spread Spectrum [5].

Three different scenarios for security assessment are studied, according to the classification given in [1]:

1. Known Message Attack (KMA): the attacker is assumed to have access to watermarked signals and to the messages embedded in each of those signals. The interest in this scenario is mainly theoretical, since it constitutes the basis for the study of more involved scenarios and provides the main insight into the security problem. It is also useful for the study of security in watermark detection scenarios.
This work was partially funded by Xunta de Galicia under projects PGIDT04 TIC322013PR and PGIDT04 PXIC32202PM; MEC project DIPSTICK, reference TEC2004-02551/TCM; European Commission through the IST Programme under Contract IST-2002-507932 ECRYPT, and Fundación Caixa Galicia grant for postgraduate studies. ECRYPT disclaimer: The information in this paper is provided as is, and no guarantee or warranty is given or implied that the information is fit for any particular purpose. The user thereof uses the information at its sole risk and liability.