Constrained Energy Maximization and Self-Referencing Method for Invisible Ink Detection from Multispectral Historical Document Images

Rachid Hedjam*, Mohamed Cheriet† and Margaret Kalacska*

* Department of Geography, McGill University, 805 Sherbrooke Street West, Montreal, QC H3A 2K6, Canada
Email: {rachid.hedjam; margaret.kalacska}@mcgill.ca

† Department of Automated Manufacturing Engineering, ÉTS, University of Québec, 1100 Notre-Dame Street West, Montréal, Québec H3C 1K3, Canada
Email: mohamed.cheriet@etsmtl.ca

Abstract—This article deals with a serious form of degradation that often affects the readability of historical document images: the invisibility of text or ink. Due to wear over long periods of storage, the ink may become invisible to the human eye, an undesirable situation for scholars such as those of the Indian Ocean World (IOW) project¹, with whom we are working closely. Because only the ink class is known a priori (as a reference), it can be treated as a target to be detected. This can be achieved by designing a linear filter that maximizes an energy function while minimizing the false detection of document image background elements. For each document image in which the ink is targeted, an internal reference is defined by a new self-referencing strategy. The proposed method is compared with state-of-the-art methods and validated on samples of real historical document images.

Keywords—Historical document image enhancement, multispectral document imaging, target detection, ink detection, constrained energy minimization, self-referencing.

I. INTRODUCTION

Historical documents are of great interest to scholars in the social sciences and humanities, especially those focusing on subjects or phenomena that have developed over the course of human history. Historical documents are the memory of human cultures: their history, their achievements, their lifestyles, and their individual and social behaviors.
Therefore, the preservation of these documents has captured the attention of many archives around the world. Digitization, which involves the acquisition, processing and dissemination of knowledge, greatly increases the value of these historical documents, both for consultation by the general public and for research purposes. Likewise, digital archives open new perspectives for research in the humanities that aims to provide a physical and logical analysis of such documents, with the ultimate objective of improving our understanding of them. This is considered an important aid to scholars interested in dating historical documents, reading the old historical writing they contain and establishing their origins. However, digital historical documents present specific difficulties that impede access to their content, e.g., degradation caused by environmental conditions such as dust, dirt and humidity (see Fig. 1). The digital image processing of historical documents is one of the important low-level tasks that provide better-quality data, which in turn allows for improved interpretation and understanding of the content. Several document image processing techniques have been proposed: thresholding-based [1], [2], [3], [4]; classification-based [5], [6], [7]; entropy information-based [8], [9]; multiscale-based [10]; variational-based [11]; learning-based [12], etc.

Fig. 1. Samples of degraded historical document images, panels (a) and (b).

¹ http://indianoceanworldcentre.com/

In this paper, we address a specific type of severe degradation that hinders the readability of historical documents: the disappearance of ink, which leaves historical text invisible or nearly invisible to the human eye (see Fig. 1). This disappearance results from wear over long periods in storage.
This problem makes the deciphering of such texts very difficult and negatively affects the understanding of historical documents, an undesirable situation for scholars. An example is shown in Fig. 1(a): the text in the upper left corner of the document image is invisible to the human eye. The sample is selected from the MS-HISTODOC dataset [13]. We note here that very few samples were available for the validation of our proposed method. Our main challenge is gaining access to historical documents in order to collect additional samples of this kind of degradation to generalize our