Security of spread-spectrum-based data hiding

Luis Pérez-Freire (a), Pierre Moulin (b) and Fernando Pérez-González (a)

(a) Signal Theory and Communications Dept., University of Vigo, 36310 Vigo, Spain, {lpfreire, fperez}@gts.tsc.uvigo.es
(b) Coord. Sci. Lab. & ECE Dept., University of Illinois at Urbana-Champaign, 405 N. Mathews Ave., Urbana, IL 61801, USA, moulin@ifp.uiuc.edu

ABSTRACT

This paper presents an information-theoretic analysis of security for data hiding methods based on spread spectrum. Security is quantified by means of the mutual information between the observed watermarked signals and the secret carrier (a.k.a. spreading vector) that conveys the watermark, a measure that can be used to bound the number of observations needed to estimate the carrier up to a certain accuracy. The main results of this paper make it possible to establish fundamental security limits for this class of methods and to draw conclusions about the tradeoffs between robustness and security. Specifically, the impact of the dimensionality of the embedding function, the host rejection, and the embedding distortion on the security level is investigated, and in some cases explicitly quantified.

1. INTRODUCTION

This paper considers the security of spread spectrum methods for data hiding from an information-theoretic perspective. The fundamentals of this approach to security assessment can be found in [1], where the security of watermarking and data hiding methods is measured by quantifying the information about the secret key (or, more precisely, about the embedding parameters that depend on the secret key) that leaks from the observation of watermarked signals, together with the corresponding remaining uncertainty. The key assumptions are that the watermarker owns a secret key that he/she uses to watermark contents, and that an attacker is able to gather several signals (observations) that were watermarked with the same secret key.
An additional assumption is that the parameters of the watermarking scheme are known, according to Kerckhoffs' principle; hence, the attacker is only interested in disclosing the secret key. The information about the key provided by the observations is quantified by means of Shannon's mutual information, and the remaining uncertainty or equivocation about the key is measured by the differential entropy of the key conditioned on the observations, which can be straightforwardly related to the lowest attainable error in the estimation of the secret key. The number of observations needed to achieve a certain estimation accuracy can be regarded as the security level of the watermarking scheme.

This approach has been used in [2] for analyzing the security of lattice DC-DM methods, and it is used here for developing a complete security analysis of spread spectrum methods, i.e., those methods that perform watermark embedding in a secret subspace by modulating a secret carrier with the symbols to be embedded. Thus, the secret carrier plays the role of the secret parameter to be estimated. Specifically, three data hiding methods are considered: additive Spread Spectrum [3], attenuated Spread Spectrum [4], and Improved Spread Spectrum [5].

Three different scenarios for security assessment are studied, according to the classification given in [1]:

1. Known Message Attack (KMA): the attacker is assumed to have access to watermarked signals and to the messages embedded in each of those signals. The interest in this scenario is mainly theoretical, since it constitutes the basis for the study of more involved scenarios and provides the main insight into the security problem. It is also useful for the study of security in watermark detection scenarios.
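
As a rough illustration of the KMA setting for additive spread spectrum, the following Python sketch embeds one binary symbol per signal by adding a signed, scaled copy of a secret carrier to the host, and then lets an attacker who knows the embedded symbols estimate the carrier by sign-corrected averaging. Everything specific here is an assumption made for illustration and is not taken from the paper: the white Gaussian host model, the unit-norm carrier, the averaging estimator, and the names `make_carrier`, `embed_additive_ss`, `kma_estimate`, `correlation` and `run_attack`. It only shows qualitatively that the estimate improves with the number of observations.

```python
import math
import random


def make_carrier(n, rng):
    """Draw a random unit-norm carrier (the secret spreading vector)."""
    s = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in s))
    return [v / norm for v in s]


def embed_additive_ss(host, carrier, bit, strength):
    """Additive spread spectrum: y = x + strength * (+1 or -1) * carrier."""
    sign = 1.0 if bit else -1.0
    return [x + strength * sign * c for x, c in zip(host, carrier)]


def kma_estimate(observations, bits):
    """Known-message attack: undo each known sign, sum, and normalise."""
    n = len(observations[0])
    acc = [0.0] * n
    for y, bit in zip(observations, bits):
        sign = 1.0 if bit else -1.0
        for k in range(n):
            acc[k] += sign * y[k]
    norm = math.sqrt(sum(v * v for v in acc))
    return [v / norm for v in acc]


def correlation(a, b):
    """Inner product of two unit-norm vectors (1.0 means a perfect estimate)."""
    return sum(u * v for u, v in zip(a, b))


def run_attack(num_obs, n=64, host_std=1.0, strength=1.0, seed=0):
    """Simulate num_obs watermarked signals and return the estimate quality."""
    rng = random.Random(seed)
    carrier = make_carrier(n, rng)  # secret, unknown to the attacker
    bits = [rng.random() < 0.5 for _ in range(num_obs)]  # known to the attacker
    observations = []
    for bit in bits:
        # Assumed host model: i.i.d. zero-mean Gaussian samples.
        host = [rng.gauss(0.0, host_std) for _ in range(n)]
        observations.append(embed_additive_ss(host, carrier, bit, strength))
    return correlation(kma_estimate(observations, bits), carrier)


if __name__ == "__main__":
    for num_obs in (5, 50, 500, 2000):
        print(num_obs, round(run_attack(num_obs), 3))
```

Under these assumptions the correlation between the estimated and the true carrier grows towards 1 as more observations are collected, which is the intuition behind treating the number of observations needed for a given accuracy as a security level.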
This work was partially funded by Xunta de Galicia under projects PGIDT04 TIC322013PR and PGIDT04 PXIC32202PM; MEC project DIPSTICK, reference TEC2004-02551/TCM; European Commission through the IST Programme under Contract IST-2002-507932 ECRYPT, and Fundación Caixa Galicia grant for postgraduate studies. ECRYPT disclaimer: The information in this paper is provided as is, and no guarantee or warranty is given or implied that the information is fit for any particular purpose. The user thereof uses the information at its sole risk and liability.