Cover Signal Specific Steganalysis: the Impact of Training on the Example of two Selected Audio Steganalysis Approaches

Christian Kraetzer (1) and Jana Dittmann (1)

(1) Otto-von-Guericke University Magdeburg, Universitaetsplatz 2, D-39106, Magdeburg
{kraetzer,jana.dittmann}@iti.cs.uni-magdeburg.de

ABSTRACT

The main goals of this paper are to show the impact of the basic assumptions about the cover channel characteristics, as well as the impact of different training/testing set generation strategies, on the statistical detectability of selected audio hiding approaches known from steganography and watermarking. As examples, we have selected five steganography algorithms and four watermarking algorithms. The channel characteristics of two different audio cover channels (an application-specific scenario of VoIP steganography and universal audio steganography) are formalised, and their impact on decisions in the steganalysis process, especially on the strategies applied for training/testing set generation, is shown. Depending on the assumptions about the cover channel characteristics, either cover dependent or cover independent training and testing can be performed, using either correlated or non-correlated training and test sets. In comparison to previous work, additional frequency domain features are introduced for steganalysis, and the performance (in terms of classification accuracy) of Bayesian classifiers and multinomial logistic regression models is compared with the results of SVM classification. We show that the newly implemented frequency domain features increase the classification accuracy achieved in SVM classification. Furthermore, it is shown on the example of VoIP steganalysis that channel characteristic specific evaluation performs better than tests without focus on a specific channel (i.e. universal steganalysis).
A comparison of test results for cover dependent and cover independent training and testing shows that the latter performs better for all nine algorithms evaluated here with the SVM based classifier used.

Keywords: Audio steganalysis, cover signal specific steganalysis

1. MOTIVATION AND INTRODUCTION

Following a classification given by Kharrazi et al. [1], steganalytical techniques can be grouped into two classes: specific and universal steganalysis techniques. What distinguishes the two classes is their focus. While the first is specific to a particular steganographic technique, the latter is effective over a wide variety of techniques. In the work by Kharrazi et al. it is shown that the two classes of steganalytical techniques allow for different training and test set generation strategies which, as a consequence, might have an impact on the classification accuracy (the ratio of correctly classified instances to all classified instances) of the steganalysis process.

Based on this classification of steganalytical techniques and on previous work in both classes, this paper presents a more abstract view on steganalysis, focussing more on the considered cover channel characteristics than on selected steganographic techniques. The main goals of this paper are the evaluation of the impact of modelling the channel characteristics in cover dependent training and the comparison of cover independent and cover dependent training and testing.

This work is based on previous work in CD-quality universal audio steganalysis and PCM based VoIP steganalysis. In 2005, Dittmann et al. [2] introduced a steganalysis approach for VoIP applications and the notions of active and passive steganography. The VoIP steganography tool was prototypically implemented for the active steganography approach using an LSB embedding scheme. In the evaluations performed in this first paper, the detectability of the scheme was measured using a steganalysis tool that considers 13 audio signal features and compares the results for original and stego files.
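For readers unfamiliar with LSB embedding, the general idea can be illustrated with a minimal sketch. It assumes signed integer PCM samples and a message given as a list of bits; the function names embed_lsb and extract_lsb and the sample values are hypothetical and are not taken from the VoIP steganography tool discussed here.

```python
def embed_lsb(samples, bits):
    """Return a copy of `samples` in which the least significant bit of the
    first len(bits) samples is replaced by the corresponding message bit.

    Assumption for this sketch: `samples` are signed integer PCM values and
    `bits` is a sequence of 0/1 integers (names are illustrative only).
    """
    if len(bits) > len(samples):
        raise ValueError("message is longer than the cover signal")
    stego = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | (bit & 1)
    return stego


def extract_lsb(samples, n_bits):
    """Read the least significant bit of the first n_bits samples."""
    return [s & 1 for s in samples[:n_bits]]


# Hypothetical 16-bit PCM cover samples and a 4-bit message.
cover = [100, -3, 257, 0, -32768, 32767]
message = [1, 0, 1, 1]
stego = embed_lsb(cover, message)
recovered = extract_lsb(stego, len(message))
```

Each modified sample differs from its cover value by at most one quantisation step, which is why detection of such schemes relies on statistical features rather than on audible differences.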
In 2006, a description of a re-implementation and adaptation of the steganography approach was given by Vogel et al. [3], now also covering the practical implementation of the passive steganography approach to perform steganalysis. Kraetzer et al. used this later version for different analyses of its perceptual [4, 5] and statistical [6, 7] detectability in comparison with other algorithms.

Parallel to the VoIP steganography tool, the analysis toolset AAST (AMSL Audio Steganalysis Toolset) was developed for the evaluation of the statistical detectability of audio steganography. Descriptions of this toolset and test results for its performance in universal audio steganalysis as well as in VoIP steganalysis are given by Kraetzer et al. [6, 7].

Based on these previous works, a formalisation is given here of the considered channel characteristics for CD-quality universal audio steganalysis and PCM based VoIP steganalysis. Additionally, the impact of channel characteristics on steganalysis performance and audio forensics introduced in Kraetzer et al. [13] is investigated by showing that their usage, together with the newly introduced frequency domain features and the chosen classifier, improves the classification accuracy for seven of the nine tested steganography algorithms and all of the tested microphone classifications.

The paper is structured as follows: In section 2 the example cover channel characteristics used in this paper are formalised, and the data hiding algorithms and audio test sets used are briefly introduced. Section 3 describes the two approaches of cover dependent and cover independent training/testing. Section 4 introduces the used steganalysis process