Suitability of Independent Component Analysis in Digital Image Forgery Detection

Sunil Kumar #1, J.V. Desai #2, S. Mukherjee *3, P. K. Das #
# Faculty of Engineering & Technology, Mody Institute of Technology & Science, Lakshmangarh, India
* Department of Electrical Engineering, Indian Institute of Technology Roorkee, India
1 skvasistha@ieee.org
2 jagandesai@yahoo.com
3 mukherjee.shaktidev@gmail.com

Abstract- Digital image forgery detection has become an active research area in recent years. Many researchers are trying different tools to establish the authenticity of a given image. Many types of forgery can be performed on a digital image. Watermarking is one of the traditional techniques for detecting tampering with the original image, but the watermark has to be embedded at the time the image is captured. Once an image has been captured without such a technique, there is no alternative but to apply blind forgery detection techniques. The present paper explores Independent Component Analysis (ICA) as a tool for obtaining clues about tampering with the original image. The results may provide further leads to researchers working in the same area.

Keywords- digital image forgery, ICA, image tampering, blind source separation

I. INTRODUCTION

Digital image forgery is very easy to perform today, given the available software tools and ever-improving hardware. It can be performed in different ways [1]. It may be as simple as inserting an object from another image to deceive the viewer, or as complicated as pasting part of the same image, after various translations and rotations of the cropped part, to hide some information. Broadly, the severe types of forgery can be classified into splicing and copy-move attacks. Recently there has been more emphasis on the copy-move attack, and comparisons of the effectiveness of the popular methods have been reported in surveys such as [2] and [3].
In spite of the many existing methods, there is a lack of an inclusive approach that deals with all kinds of image tampering. In the noted trends of forgery, more than one image is mixed to create the resultant forged image. So, essentially, if it can be established that the resultant image is a mixture of multiple images, it can be termed a forged image. Independent component analysis (ICA) is well known for its ability to separate sources when the sources are statistically independent. Since ICA can extract the independent components of a mixed audio signal, as in the cocktail party problem [4], it can also be used for independent component extraction in images. ICA is already used in pattern recognition and medical imaging, but it has not been tried extensively for detecting tampering in digital images.

II. INDEPENDENT COMPONENT ANALYSIS

Independent component analysis (ICA) is a statistical and computational technique for revealing hidden factors that underlie sets of random variables, measurements, or signals. ICA defines a generative model for the observed multivariate data, which is typically given as a large database of samples. In the model, the data variables are assumed to be linear mixtures of some unknown latent variables, and the mixing system is also unknown. The latent variables are assumed to be non-Gaussian and mutually independent, and they are called the independent components of the observed data. These independent components, also called sources or factors, can be found by ICA. The ICA algorithm was initially proposed in [5] to solve the blind source separation (BSS) problem: given only mixtures of a set of underlying sources, the task is to separate the mixed signals and retrieve the original sources. Neither the mixing process nor the distribution of the sources is known in advance.
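As an illustration of the blind source separation setting described above, the following minimal Python sketch mixes two independent non-Gaussian signals linearly and then tries to recover them from the mixtures alone. It is not the algorithm of [5], nor the procedure used in this paper: it assumes exactly two sources and two mixtures, and it stands in for a full ICA algorithm with whitening followed by a grid search over rotation angles that maximises the absolute excess kurtosis of the outputs. The mixing matrix `A`, the source distributions, the noise level, and all function names are hypothetical choices made for the example.

```python
import math
import random

def center(x):
    """Subtract the sample mean."""
    m = sum(x) / len(x)
    return [v - m for v in x]

def kurtosis(x):
    """Excess kurtosis of a zero-mean signal (close to 0 for Gaussian data)."""
    n = len(x)
    m2 = sum(v * v for v in x) / n
    m4 = sum(v ** 4 for v in x) / n
    return m4 / (m2 * m2) - 3.0

def rotate(a, b, theta):
    """Apply a 2-D rotation to the signal pair (a, b), sample by sample."""
    c, s = math.cos(theta), math.sin(theta)
    return ([c * u + s * v for u, v in zip(a, b)],
            [-s * u + c * v for u, v in zip(a, b)])

def whiten(x1, x2):
    """Decorrelate two zero-mean signals and scale each to unit variance."""
    n = len(x1)
    c11 = sum(u * u for u in x1) / n
    c22 = sum(v * v for v in x2) / n
    c12 = sum(u * v for u, v in zip(x1, x2)) / n
    # Rotating by this angle diagonalises the 2x2 sample covariance matrix.
    phi = 0.5 * math.atan2(2.0 * c12, c11 - c22)
    p1, p2 = rotate(x1, x2, phi)
    sd1 = math.sqrt(sum(u * u for u in p1) / n)
    sd2 = math.sqrt(sum(v * v for v in p2) / n)
    return [u / sd1 for u in p1], [v / sd2 for v in p2]

def separate(x1, x2, steps=90):
    """Grid-search the rotation of the whitened data that maximises non-Gaussianity."""
    z1, z2 = whiten(center(x1), center(x2))
    best_theta, best_score = 0.0, -1.0
    for k in range(steps):
        theta = 0.5 * math.pi * k / steps
        y1, y2 = rotate(z1, z2, theta)
        score = abs(kurtosis(y1)) + abs(kurtosis(y2))
        if score > best_score:
            best_theta, best_score = theta, score
    return rotate(z1, z2, best_theta)

def abs_corr(a, b):
    """Absolute Pearson correlation between two signals."""
    a, b = center(a), center(b)
    num = sum(u * v for u, v in zip(a, b))
    den = math.sqrt(sum(u * u for u in a) * sum(v * v for v in b))
    return abs(num / den)

rng = random.Random(0)
T = 2000
# Two independent non-Gaussian sources: uniform and Laplacian (illustrative choices).
s1 = [rng.uniform(-1.0, 1.0) for _ in range(T)]
s2 = [rng.choice((-1.0, 1.0)) * rng.expovariate(1.0) for _ in range(T)]
# Hypothetical 2x2 mixing matrix A and a small additive Gaussian noise term.
A = [[1.0, 0.6], [0.4, 1.0]]
x1 = [A[0][0] * u + A[0][1] * v + rng.gauss(0.0, 0.01) for u, v in zip(s1, s2)]
x2 = [A[1][0] * u + A[1][1] * v + rng.gauss(0.0, 0.01) for u, v in zip(s1, s2)]

y1, y2 = separate(x1, x2)
# Sources are recovered only up to order, sign and scale, so compare by |correlation|.
print([round(abs_corr(y, s), 3) for y in (y1, y2) for s in (s1, s2)])
```

Because ICA recovers sources only up to permutation, sign, and scale, the sketch compares the outputs with the true sources through absolute correlation: each recovered signal should correlate strongly with exactly one source. For more than two mixtures, or for image data, an established ICA implementation would be the more appropriate choice.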
A simple mathematical representation of the ICA model [6] is as follows. Consider a simple linear model consisting of N sources of T samples each, i.e. $S_i = [S_i(1), \ldots, S_i(t), \ldots, S_i(T)]$. The symbol t represents time, but it may represent some other parameter such as space. M weighted mixtures of the sources are observed as X, where $X_i = [X_i(1), \ldots, X_i(t), \ldots, X_i(T)]$. This can be represented as:

$$X = AS + n$$

where $X = (X_1, X_2, X_3, \ldots, X_M)$, $S = (S_1, S_2, S_3, \ldots, S_N)$ and $n = (n_1, n_2, n_3, \ldots, n_M)$, and n represents the additive white Gaussian noise (AWGN). It is assumed that there are at least as many observations as sources, i.e. $M \geq N$. The $M \times N$ matrix A is represented as

Sunil Kumar et al. / International Journal of Engineering and Technology (IJET) ISSN : 0975-4024 Vol 5 No 1 Feb-Mar 2013 226