DATA-HIDING IN AUDIO USING FREQUENCY-SELECTIVE PHASE ALTERATION

Rashid Ansari, Hafiz Malik, Ashfaq Khokhar
Dept. of Electrical and Computer Engineering
University of Illinois at Chicago, Illinois, USA

ABSTRACT

A novel perception-based data hiding technique for digital audio is proposed. It exploits the lower sensitivity of the human auditory system (HAS) to phase distortion than to magnitude distortion in audio. Audio is decomposed into subband signals, some of which are selected for embedding data through a controlled alteration of phase using suitable allpass digital filters. The proposed scheme is robust to standard data manipulations, yielding less than 2% error probability against compression, re-sampling, re-quantization, random chopping, and noise addition. The proposed method is also robust to desynchronization attacks.

1. INTRODUCTION

Digital media production and distribution has witnessed dramatic growth in recent years. The availability of the Internet, low-cost and reliable storage devices, and high-speed networks has made it possible to replicate and distribute digital information easily. This has created a need for protection and enforcement of intellectual property rights for digital media to prevent its illegal copying/reproduction. The need has spurred research in data hiding in digital media due to its applications in watermarking, annotation, and steganography [2].

The development of data-hiding methods requires many design and quality tradeoffs. An important requirement is that embedded data should be imperceptible. In addition, data embedding should be robust to standard digital data manipulations such as lossy compression, noise addition, and sampling rate conversion. Moreover, embedded data should be tamperproof against adversarial attacks. Perception-based data hiding schemes for audio are influenced by properties of the human auditory system (HAS).
In the past, different perception-based algorithms were proposed for data hiding/watermarking in audio data [4-10]. These algorithms can be broadly classified according to the underlying technique of data embedding: perceptual masking [3, 4, 5, 9], direct sequence spread spectrum (DSSS) [3, 4], and phase coding [3, 7, 8]. Algorithms based on phase coding [3, 7, 8] work well as far as imperceptibility of the embedded data is concerned, but suffer from some limitations: some of them do not perform well against standard data manipulations, and most of them carry small payloads, i.e., the amount of information embedded is small. For example, the phase coding technique proposed in [3] can embed only 16-32 bits of data in one second of audio. The algorithm based on echo-based coding [6] can embed about 40-50 bits of data in one second of an audio signal.

In this paper, a novel method is proposed to exploit the HAS property that human auditory perception is largely insensitive to audio phase distortion [1] in a certain range of audible frequencies. In this method, audio is decomposed into subband signals, some of which are selected for embedding data with a controlled alteration of phase. The full audible frequency range, i.e., 20 Hz to 20 kHz, is unsuitable for such data embedding. In the higher frequency range it is hard to detect phase changes reliably. The higher frequency range contains insignificant signal energy, and compression techniques based on perceptual auditory models, such as MP3 [11], generally discard information in these frequency bands because the resulting distortion is perceptually inaudible. Moreover, low signal energy makes detection susceptible to error due to additive noise in the higher frequency range. To make embedded data more robust and resilient to standard data manipulations, the frequency range should be carefully selected to ensure imperceptibility as well as robustness of the embedded information.
In our proposed method, a frequency range suitable for data hiding is first selected. The signal content in this range is partitioned into subbands using a discrete wavelet packet analysis filter bank (DWPA-FB). Next, the data is embedded in the audio by modifying its phase in selected frequency bands using suitable allpass digital filters (APF) [1]. For synchronization, the input audio data is analyzed to extract salient points characterized by fast tonal transitions. These salient points are also attack-sensitive regions [9]. Therefore, synchronization based on these salient points can withstand desynchronization attacks such as random sample chopping, or intentional attacks in which samples are randomly added to or deleted from audio with embedded data.

The frequency-selective phase alteration (FS-PA) technique can reliably embed more than 1000 bits of data in an audio segment of one-second duration, which is 10-15 times more than existing methods. The proposed technique is robust to standard data manipulations, yielding less than 2% error probability for manipulations such as lossy compression and noise addition.
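To illustrate why allpass filtering suits phase-based embedding, the following is a minimal sketch, not the authors' implementation. It builds a second-order allpass filter whose magnitude response is unity at every frequency while its phase response depends on the pole location, and it selects one of two pole locations according to the bit to embed. The pole values in `POLES`, the function names, and the one-filter-per-bit mapping are assumptions made for illustration only; the subband decomposition (DWPA-FB), the synchronization step, and the decoder are omitted.

```python
import cmath
import math


def allpass_coeffs(r, theta):
    """Second-order allpass with poles at r*exp(+/-j*theta), 0 <= r < 1.

    The numerator is the denominator with its coefficients reversed,
    which forces |H(e^{jw})| = 1 at every frequency w.
    """
    a = [1.0, -2.0 * r * math.cos(theta), r * r]  # denominator
    b = a[::-1]                                   # numerator
    return b, a


def freq_response(b, a, omega):
    """Evaluate H(e^{j*omega}) = B(z) / A(z) on the unit circle."""
    z_inv = cmath.exp(-1j * omega)
    num = sum(bk * z_inv ** k for k, bk in enumerate(b))
    den = sum(ak * z_inv ** k for k, ak in enumerate(a))
    return num / den


def filter_signal(b, a, x):
    """Direct-form I IIR filtering of the sequence x (assumes a[0] == 1)."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc)
    return y


# Hypothetical (radius, angle) pole parameters for bit 0 and bit 1.
# These values are illustrative assumptions, not taken from the paper.
POLES = {0: (0.8, math.pi / 4), 1: (0.8, math.pi / 3)}


def embed_bit(frame, bit):
    """Alter the phase of one frame with the allpass filter chosen by `bit`."""
    b, a = allpass_coeffs(*POLES[bit])
    return filter_signal(b, a, frame)


if __name__ == "__main__":
    omega = math.pi / 4
    for bit in (0, 1):
        h = freq_response(*allpass_coeffs(*POLES[bit]), omega)
        print(f"bit={bit} |H|={abs(h):.6f} phase={cmath.phase(h):+.4f} rad")
```

Both filters leave the magnitude spectrum untouched, so the two possible outputs differ only in phase; a detector would then have to distinguish the two phase responses, which this sketch does not attempt.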