S.K. Pal et al. (Eds.): PReMI 2005, LNCS 3776, pp. 459-464, 2005. © Springer-Verlag Berlin Heidelberg 2005

A Chosen Plaintext Steganalysis of Hide4PGP V 2.0

Debasis Mazumdar 1, Soma Mitra 1, Sonali Dhali 1, and Sankar K. Pal 2

1 CDAC, Kolkata, Plot E2/1, Block – GP, Sector V, Salt Lake City, Kolkata – 700091
2 Machine Intelligence Unit, Indian Statistical Institute, Kolkata - 700108, India

Abstract. A chosen plaintext steganalysis algorithm is described that isolates the corrupted bits in an image tampered with using Hide4PGP V 2.0. The method is developed from the notion of representing two-dimensional image data as a linear bit stream consisting of a set of basic building blocks. Its performance for message extraction is demonstrated on different 24-bit BMP images.

1 Introduction

Hide4PGP V 2.0 is a readily available steganography tool that is frequently used to tamper with both image and audio files. It encrypts the message before distribution, thereby making the tool robust. Some detection schemes have been reported in this regard, based on statistical measures [1] and palette color [2]. First- and higher-order statistics have also been used to discriminate between images with and without hidden messages embedded by LSB-embedding stego tools such as Hide4PGP [3], [4]. M. K. Johnson et al. [5] have used frequency domain analysis of audio signals to detect the existence of hidden messages embedded using Hide4PGP. However, these methods can only detect the presence or absence of a hidden message within a cover. In the present article, we describe a steganalysis algorithm based on a chosen plaintext attack, which can identify and isolate the corrupted bits. Isolating the corrupted bits allows the ciphertext to be recovered. To reconstruct the plaintext message from the ciphertext, one may use further cryptanalysis. The algorithm uses a unique method of representing two-dimensional image data as a linear bit string consisting of a set of basic building blocks.
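
As a rough illustration of the linear bit string idea, the sketch below flattens the LSBs of the R, G and B components of an image into one list of bits. The function name `lsb_stream`, the representation of pixels as (R, G, B) tuples, and the pixel-by-pixel, R-then-G-then-B ordering are assumptions made for this example only; they are not taken from the paper and do not reproduce its building blocks.

```python
from typing import Iterable, List, Tuple

# Assumed pixel representation: (R, G, B), each component in 0..255.
Pixel = Tuple[int, int, int]


def lsb_stream(pixels: Iterable[Pixel]) -> List[int]:
    """Flatten the least significant bits of R, G and B into one bit list.

    The ordering (pixel by pixel, R then G then B) is an assumption of
    this sketch, not the ordering used by Hide4PGP or by the paper.
    """
    bits: List[int] = []
    for r, g, b in pixels:
        # Mask each colour component with 1 to keep only its LSB.
        bits.extend((r & 1, g & 1, b & 1))
    return bits


# Hypothetical two-pixel example: only the first R and last B are odd.
example_pixels = [(201, 200, 200), (200, 200, 201)]
example_bits = lsb_stream(example_pixels)
```

For an image of p pixels, the resulting list has 3p entries, one for each LSB position.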
The effectiveness of the method is demonstrated on a set of 24-bit BMP images of varying sizes.

2 Characteristics of Message Distribution

After a series of experiments using a monotonic (single color) cover image of p pixels and known plaintext, we found the following characteristics (patterns) of the distribution used to embed a message in the 3p LSB positions.

i) The message, of length m_L, is embedded symmetrically within the LSBs of the R, G and B components of the p pixels. These LSBs, when arranged in a linear string, show reflection symmetry with respect to the centre position. Also, in each part the arrangement is the same when read from left to right or vice versa. These are