A Second Look at First Significant Digit Histogram Restoration

Matthias Kirchner and Sujoy Chakraborty
Department of Electrical and Computer Engineering
Binghamton University, Binghamton, NY 13902–6000
Email: {kirchner, schakra2}@binghamton.edu

Abstract: We analyze a class of first significant digit (FSD) histogram restoration techniques designed to cover up traces of previous JPEG compressions under a minimum cost constraint. We argue that such minimal distortion mappings introduce strong artifacts to the distribution of DCT coefficients, which become particularly prevalent in the domain of second significant digits (SSDs). Empirical findings from large image databases give insight into SSD distributions of DCT coefficients of natural images and demonstrate how images that underwent FSD histogram restoration deviate from natural images.

I. INTRODUCTION

Over the past decade, image forensics has matured into a research field where advances are more routinely assessed from a security perspective. Embedded into the broader trend of adversarial signal processing [1], counter-forensics [2] subsumes attempts to systematically mislead forensic techniques. While early works were mostly of a heuristic nature [3], there is now a growing body of theoretical work that contributes to the necessary rigorous foundations [4], [5].

Driven by the widespread use of JPEG images, the compression history of digital images is among the most extensively studied subjects. A standard problem on the forensic side is the detection of previous JPEG compressions. Counter-forensic methods that make JPEG images look like uncompressed ones, or that hide traces of double compression, also abound [6], [7]. These developments have led to a new generation of forensic algorithms focusing on traces left by counter-forensics [8], [9]. This paper follows the fruitful path of interaction between forensics and counter-forensics.
Our particular emphasis is on a recent class of counter-forensic techniques that restore first significant digit (FSD) histograms of block-DCT coefficients to cover up artifacts of previous JPEG compression(s) under a minimum distortion constraint [10], [11]. We demonstrate how optimal FSD histogram restoration has a strong impact on the distribution of second significant digits (SSDs), which may be exploited to detect FSD restoration. Along the way, we provide insights into SSD distributions of block-DCT modes of natural images, which appear to follow a Benford-like law in certain instances.

In the remainder of this paper, Sect. II briefly reviews the current literature on FSD forensics and counter-forensics, before Sect. III discusses how SSDs of FSD-restored sequences differ from natural sequences. Sect. IV introduces some theoretical background on SSD distributions. Sect. V describes our experimental setup to explore empirical aspects of SSD distributions and histogram restoration in Sects. VI and VII. Sect. VIII concludes the paper.

II. FSD IMAGE FORENSICS AND COUNTER-FORENSICS

A. First Significant Digits and Benford’s Law

We follow the notation in [11] and write the first significant digit (FSD) of a non-zero number $x \in \mathbb{R} \setminus \{0\}$ as

$$ d_1(x) = \left\lfloor \frac{|x|}{10^{\lfloor \log_{10} |x| \rfloor}} \right\rfloor = \left\lfloor 10^{c(x)} \right\rfloor , \tag{1} $$

where $c(x) = \log_{10}|x| - \lfloor \log_{10}|x| \rfloor = \log_{10}|x| \bmod 1$ is the coset representative of $x$ in Benford’s domain. With bin boundaries

$$ b_i = \log_{10}(i + 1) , \tag{2} $$

the $i$-th FSD histogram bin of a sequence $\mathbf{x} = (x_1, \dots, x_N)$, $x_k \neq 0$, is given as

$$ h_1(\mathbf{x}, i) = \left| \{ x_k : b_{i-1} \le c(x_k) < b_i \} \right| , \quad 1 \le i \le 9 . \tag{3} $$

The FSD histogram of $\mathbf{x}$ is $\mathbf{h}_1(\mathbf{x}) = (h_1(\mathbf{x}, 1), \dots, h_1(\mathbf{x}, 9))$. A sequence $\mathbf{x}$ is said to follow Benford’s law [12] if

$$ h_1(\mathbf{x}, i) \approx N \log_{10}(1 + 1/i) . \tag{4} $$

A sufficient condition for Benford’s law to be satisfied is a uniform distribution of $c(x)$ over the interval $[0, 1)$ [13].
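As an illustration of the definitions in (1)–(4), the following minimal Python sketch computes coset representatives, first significant digits, and an FSD histogram, and compares the histogram with the counts predicted by Benford's law. It is not taken from the paper: the function names (`coset`, `first_significant_digit`, `fsd_histogram`, `benford_counts`, `demo`) are hypothetical, and the test sequence is synthetic, drawn so that $c(x)$ is approximately uniform on $[0, 1)$. Digits are assigned by locating $c(x)$ among the bin boundaries $b_i$ as in (3); values that sit exactly on a bin boundary may be affected by floating-point rounding.

```python
import bisect
import math
import random

# Interior bin boundaries b_1..b_8 with b_i = log10(i + 1), cf. (2).
# b_0 = 0 and b_9 = 1 are implicit.
BOUNDARIES = [math.log10(i + 1) for i in range(1, 9)]


def coset(x):
    """Coset representative c(x) = log10|x| mod 1 of a non-zero number."""
    if x == 0:
        raise ValueError("the FSD is undefined for zero")
    return math.log10(abs(x)) % 1.0


def first_significant_digit(x):
    """Digit i such that b_{i-1} <= c(x) < b_i, cf. (3)."""
    return bisect.bisect_right(BOUNDARIES, coset(x)) + 1


def fsd_histogram(xs):
    """Counts h_1(x, 1), ..., h_1(x, 9) for a sequence of non-zero numbers."""
    hist = [0] * 9
    for x in xs:
        hist[first_significant_digit(x) - 1] += 1
    return hist


def benford_counts(n):
    """Counts N * log10(1 + 1/i) predicted by Benford's law, cf. (4)."""
    return [n * math.log10(1 + 1 / i) for i in range(1, 10)]


def demo(n=20000, seed=0):
    """Synthetic sequence with (approximately) uniform c(x): x = 10**u."""
    rng = random.Random(seed)
    xs = [10 ** rng.uniform(0.0, 3.0) for _ in range(n)]
    return fsd_histogram(xs), benford_counts(n)
```

Calling `demo()` returns the empirical histogram together with the Benford prediction for the same sequence length. Because the exponents are drawn uniformly over whole decades, the two should agree up to sampling noise, in line with the sufficient condition stated above.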
It is commonly assumed that block-DCT modes of natural images obey Benford’s law to some degree (amongst many other types of “natural” data), yet generalized forms of the law have also been proposed to be more compliant with empirical FSD distributions of DCT coefficients from uncompressed images [14] or quantized JPEG coefficients [15]. B. Image Forensics Based on First Significant Digits A number of forensic techniques work with assumptions about the distribution of first significant digits of block-DCT coefficients. Inference about the JPEG compression history of digital images is a typical application, for instance to determine whether a bitmap image had been JPEG compressed before, or whether a JPEG image underwent multiple compression cycles. The common working assumption of these methods is that lossy JPEG compression changes the FSD distribution