BENFORD’S LAW IN IMAGE PROCESSING

Fernando Pérez-González, Greg L. Heileman and Chaouki T. Abdallah

Dept. Teoría de la Señal y Comunicaciones, ETSI Telecom., Universidad de Vigo, 36200 Vigo, Spain; ECE Dept., University of New Mexico, Albuquerque, NM 87131, USA

ABSTRACT

We present a generalization of Benford’s law for the first significant digit. This generalization is based on keeping two terms of the Fourier expansion of the probability density function of the data in the modular logarithmic domain. We prove that images in the Discrete Cosine Transform domain closely follow this generalization. We use this property to propose an application in image steganalysis, namely, detecting that a given image carries a hidden message.

Index Terms: Benford’s law, DCT, Fourier series, steganalysis, watermarking.

1. INTRODUCTION

Benford’s law of “anomalous digits” was enunciated by General Electric physicist Frank L. Benford in 1938 [?], and predicts the frequency of appearance of the most significant digit (MSD) for a broad range of natural and artificial data. Given a number in decimal form, the MSD is simply the leading digit of the mantissa (assuming that the exponent is a power of 10); hence, the MSD cannot take the value 0. For instance, the MSD of 2.85 is 2, and the MSD of 0.0034 is 3. Benford’s law states that the probability that the MSD takes the value $d \in \{1, 2, \ldots, 9\}$ is

$$P(d) = \log_{10}\left(1 + \frac{1}{d}\right) \qquad (1)$$

Since Benford’s paper, many works have made significant contributions at both the fundamental and the application levels. It can be safely said that the underlying mechanisms that make Benford’s law hold in many useful situations are known; these will be briefly reviewed in Section ??. On the other hand, at a practical level, Benford’s law has been shown to apply to the half-lives of radioactive particles [?], financial data [?], regression coefficients, and many other types of data.

Of particular interest for our purposes is the work by J.M.
Jolion [?], who showed that Benford’s law holds reasonably well in gradient images and in pyramidal decompositions based on the Laplace transform. To the best of our knowledge, the only other work dealing with Benford’s law for images is due to E. Acebo and M. Sbert [?], who proposed using Benford’s law to determine whether synthetic images were generated using physically realistic methods, although the fact that many real images do not follow Benford’s law (see Section ??) calls this application into question.

In this paper we show that, while images in the “pixel” domain do not seem to obey Benford’s law, the situation changes quite dramatically when they are transformed using the Discrete Cosine Transform (DCT). Furthermore, we will present a generalization of Benford’s law, based on Fourier analysis, that leads to a much closer fit to the observed digit frequencies. We will also give a theoretical explanation of why images in the DCT domain satisfy the generalized law; this explanation relies heavily on well-known and thoroughly tested statistical properties of DCT coefficients. Finally, we will hint at some possible applications in forensics, by showing how the Fourier-based formulation can be used to detect whether an image has been watermarked.

This work was partially funded by Xunta de Galicia under projects PGIDT04 TIC322013PR, PGIDT04 PXIC32202PM, and Competitive research units program Ref. 150/2006; MEC project DIPSTICK, reference TEC2004-02551/TCM; and European Commission through the IST Programme under Contract IST-2002-507932 ECRYPT.

2. BACKGROUND

In this section we recall some of the known properties of random variables in the context of Benford’s law.

Property 1: A random variable $X$ follows Benford’s law if the random variable $Y = \log_{10} X \bmod 1$ is uniform in $[0, 1)$. A random variable satisfying this latter property is called strong Benford, while the domain where $Y$ is defined is called the Benford domain.

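Property 1 and the digit law in (1) can be checked numerically. The following is a minimal sketch, not part of the original paper: the helper names (`benford_pmf`, `msd`, `empirical_msd_freqs`), the sample size, the seed, and the range of integer exponents are all illustrative assumptions. It draws $Y$ uniformly on $[0, 1)$, builds $X = 10^{Y + k}$ for an arbitrary integer $k$ so that $\log_{10} X \bmod 1 = Y$, and compares the observed leading-digit frequencies with $\log_{10}(1 + 1/d)$.

```python
import math
import random
from collections import Counter


def benford_pmf(d: int) -> float:
    """Probability of leading digit d under Benford's law, Eq. (1)."""
    return math.log10(1.0 + 1.0 / d)


def msd(x: float) -> int:
    """Most significant (leading) decimal digit of a nonzero number."""
    if x == 0:
        raise ValueError("the MSD of zero is undefined")
    # Scientific notation places the leading digit of the mantissa first.
    return int(f"{abs(x):.15e}"[0])


def empirical_msd_freqs(samples):
    """Relative frequency of each leading digit 1..9 in `samples`."""
    counts = Counter(msd(x) for x in samples)
    n = len(samples)
    return {d: counts.get(d, 0) / n for d in range(1, 10)}


rng = random.Random(0)  # fixed seed, chosen only for reproducibility

# Property 1: Y uniform on [0, 1); the integer exponent k is arbitrary
# (here hypothetically drawn from -3..3) and does not affect the MSD.
strong_benford = [
    10 ** (rng.random() + rng.randint(-3, 3)) for _ in range(100_000)
]
freqs = empirical_msd_freqs(strong_benford)

# Print digit, theoretical probability, and observed frequency.
for d in range(1, 10):
    print(d, round(benford_pmf(d), 4), round(freqs[d], 4))
```

The leading digit is read from the scientific-notation string rather than computed as $\lfloor 10^{\log_{10} x \bmod 1} \rfloor$, since the latter can be off by one when floating-point rounding pushes the mantissa just below an integer.
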
Property 2 (Scale invariance): Suppose that $X$ follows Benford’s law; then the random variable $Z = \alpha X$ will follow Benford’s law for an arbitrary $\alpha$ if and only if $X$ is strong Benford.

Besides scale invariance, it can be shown that Benford’s law is also related to base invariance. In fact, a random variable is scale and base invariant if and only if it is strong Benford.

Property 3 (Product of independent random variables): Let $X$ be strong Benford, and let $Y$ be another random variable independent of $X$. Then, the random variable $Z = X \cdot Y$ is strong Benford.

The product interpretation connects Benford’s law to mixtures of random variables. Mixtures of random variables are relevant in image processing after the proof by Hjorungnes et al. [?] that a Laplacian distribution (often used to model the coefficients of a block-wise DCT) can be written as a mixture of Gaussians whose variance is controlled by an exponential distribution. Thus, if $f_X(x)$ denotes a zero-mean unit-variance Gaussian pdf, and $f_X(z|\sigma^2)$ the corresponding zero-mean Gaussian pdf with variance $\sigma^2$, the mixture takes the form

$$f_Z(z) = \int_0^\infty f_X(z|\sigma^2) \, f_{\sigma^2}(\sigma^2) \, d\sigma^2 \qquad (2)$$

where $f_{\sigma^2}(\sigma^2)$ is an exponential pdf. Interestingly, mixtures of the general form given in (??) can be written in such a way that Property ?? can be straightforwardly applied. Indeed, the random variable $Z$ whose pdf $f_Z(z)$ is obtained through (??) can be written as $Z = X \cdot \Sigma$, with $\Sigma$ the random variable that controls the variance. From here, it follows immediately that if either $X$ or $\Sigma$ conforms to Benford’s law, then so does $Z$.

I - 405 1-4244-1437-7/07/$20.00 ©2007 IEEE ICIP 2007

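The product form $Z = X \cdot \Sigma$ of the Gaussian mixture in (2) can be illustrated with a short simulation. This is a sketch under stated assumptions rather than the construction used in the cited proof: the function names, the Laplacian scale `b`, the sample size, and the seed are hypothetical. The sketch uses the parametrization in which $\sigma^2$ is exponential with mean $2b^2$, which yields a zero-mean Laplacian with scale $b$, and compares empirical tail probabilities of $Z$ with the Laplacian tail $P(|Z| > t) = e^{-t/b}$.

```python
import math
import random


def sample_laplacian_via_mixture(n, b=1.0, seed=0):
    """Draw n samples of Z = X * Sigma.

    X is standard Gaussian and Sigma**2 is exponential with mean 2*b**2,
    an illustrative parametrization that makes Z Laplacian with scale b.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                     # X ~ N(0, 1)
        var = rng.expovariate(1.0 / (2.0 * b * b))  # sigma^2, mean 2*b^2
        samples.append(x * math.sqrt(var))          # Z = X * Sigma
    return samples


def laplacian_tail(t, b=1.0):
    """P(|Z| > t) for a zero-mean Laplacian with scale b."""
    return math.exp(-t / b)


z = sample_laplacian_via_mixture(200_000)

# Print threshold, empirical tail probability, and Laplacian tail.
for t in (0.5, 1.0, 2.0):
    empirical = sum(abs(v) > t for v in z) / len(z)
    print(t, round(empirical, 4), round(laplacian_tail(t), 4))
```

Sampling $X$ and $\sigma^2$ independently and multiplying mirrors the factorization to which the product property is applied; the tail comparison only checks that the product is Laplacian and makes no claim about its digit statistics.
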