AN IMAGE SIGNATURE FOR ANY KIND OF IMAGE

H. Chi Wong, Marshall Bern, and David Goldberg
Xerox Palo Alto Research Center
3333 Coyote Hill Rd., Palo Alto, CA 94304

ABSTRACT

We describe an algorithm for computing an image signature, suitable for first-stage screening for duplicate images. Our signature relies on relative brightness of image regions, and is generally applicable to photographs, text documents, and line art. We give experimental results on the sensitivity and robustness of signatures for actual image collections, and also results on the robustness of signatures under transformations such as resizing, rescanning, and compression.

1. BACKGROUND AND MOTIVATION

Massive image databases are becoming increasingly common. Examples include document image databases such as declassified government documents [8], and photo archives such as the New York Times archive. Duplicate removal offers space and bandwidth savings, and more user-friendly search results. Despite some effort to cull duplicates, the image search service of Google [4] often retrieves a number of duplicate and near-duplicate images. Duplicate detection also finds application in copyright protection and authentication.

We believe that duplicate detection is most effectively solved in two distinct stages. A fast first stage reduces images to small signatures, with the property that signatures of two different versions of the same image have small vector distance relative to the typical distance between signatures of distinct images. A slow second stage then makes a detailed comparison of candidate duplicates identified in the first stage. Detailed comparison of document images can identify changes as small as moving a decimal point [16].

In this paper we give a fast and simple algorithm for the first stage.
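
The first-stage screening described above can be illustrated with a minimal sketch. It assumes signatures are already available as equal-length numeric vectors and uses Euclidean distance with a caller-supplied cutoff. The names `l2_distance`, `candidate_duplicates`, and `threshold` are hypothetical, and both the distance measure and the cutoff are assumptions made for illustration, not choices stated in this section.

```python
import math
from itertools import combinations


def l2_distance(a, b):
    """Euclidean distance between two equal-length signature vectors."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def candidate_duplicates(signatures, threshold):
    """Return name pairs whose signatures lie within `threshold` of each other.

    `signatures` maps an image name to its signature vector. The threshold
    is a hypothetical tuning parameter; pairs returned here would be handed
    to the slower second-stage comparison.
    """
    pairs = []
    # Brute-force comparison of every pair, in sorted name order.
    for (name_a, sig_a), (name_b, sig_b) in combinations(sorted(signatures.items()), 2):
        if l2_distance(sig_a, sig_b) <= threshold:
            pairs.append((name_a, name_b))
    return pairs
```

Comparing every pair is quadratic in the number of images, so a large collection would need an index over the signatures. The sketch only shows the screening criterion itself.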
Our image signature encodes the relative brightness of different regions of the image; it can be applied quite generally to text document images, line art (such as cartoons), and continuous-tone images. Although there are a number of image signature schemes already in the literature, no single signature applies to such a wide class of images. The main limitation of our signature is that it is not designed to handle large amounts of cropping or rotation. This design choice is appropriate for document image databases and Web search, but not for object recognition or for detecting copyright violations.

2. PREVIOUS WORK

Image signatures have already been used to address three different, but closely related, problems: similarity search, authentication, and duplicate detection. The three problems have slightly different characteristics that affect the design of the signature. Signatures for similarity search [10] (for example, color histograms) must preserve similarity, not just identity, and hence do not necessarily spread out non-duplicates very effectively. Signatures for authentication [1, 11, 13, 15] must be more sensitive (but can be much slower) than signatures for the first stage of duplicate detection. Nevertheless, the techniques developed for search and authentication can be harnessed for duplicate detection.

Our work adapts a method by O’Gorman and Rabinovich [11] for computing a signature for ID card photos. Their method places a grid of points on the ID card photo and computes a vector of 8 bits for each point in the grid; roughly speaking, each bit records whether a neighborhood of that point is darker or lighter than a neighborhood of one of its 8 diagonal or orthogonal neighbors. In the usage scenario, the image signature computed from the photo is compared with a reference signature written on the ID card and digitally signed with public-key cryptography.
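
The 8-bit neighbor comparison can be sketched roughly as follows. This is an illustration of the idea as summarized above, not the exact procedure of [11]. It assumes the mean brightness of each grid point's neighborhood has already been computed into a 2-D list `means`, sets a bit when the point is darker than the corresponding neighbor, and leaves the bit at 0 for neighbors that fall outside the grid. The names `OFFSETS` and `neighbor_bits`, the bit ordering, and the border handling are all assumptions.

```python
# Offsets (row, column) to the 8 diagonal and orthogonal neighbors.
# The ordering, and hence the bit position of each neighbor, is arbitrary.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
           (0, -1),           (0, 1),
           (1, -1),  (1, 0),  (1, 1)]


def neighbor_bits(means):
    """Return one 8-bit code per grid point.

    `means[r][c]` is the mean brightness around grid point (r, c).
    Bit i of the code is 1 when the point is darker (lower brightness)
    than its i-th neighbor, and 0 otherwise or when the neighbor is
    outside the grid.
    """
    rows, cols = len(means), len(means[0])
    codes = []
    for r in range(rows):
        row_codes = []
        for c in range(cols):
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if inside and means[r][c] < means[rr][cc]:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes
```

Because each bit depends only on the order of two brightness values, the codes stay the same under any change that preserves that order, such as a uniform brightening of the image.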

Another technique worth considering lets the image content dictate the neighborhoods used in the signature [1, 14]. Schmid and Mohr [14] use a corner-point detector to define “interesting points” in the image. The signature then includes a statistical abstract of the neighborhood of each interesting point. Compared to grid-based methods such as ours, this approach has the advantage that it can handle large amounts of cropping, and if the statistics at each interesting point are rotation invariant, it can also handle arbitrary amounts of rotation. On the other hand, it seems to take several hundred interesting points to obtain a reliable signature, and this approach is likely to break down entirely for text documents or line art images, which may contain thousands of interesting points with very similar statistics.

Also available in the literature are specialized signature