Sliding window based approach for document image mosaicing

P. Shivakumara a,*, G. Hemantha Kumar b, D.S. Guru b, P. Nagabhushan b

a Department of Computer Science, School of Computing, National University of Singapore, 3 Science Drive 2, Singapore 117543, Singapore
b Department of Studies in Computer Science, Manasagangotri, University of Mysore, Mysore-570006, Karnataka, India

Received 11 February 2004; received in revised form 19 September 2005; accepted 24 September 2005

Abstract

It is not always possible to capture a large document in a single pass with a given imaging medium, such as a scanner or a Xerox machine, because of the inherent limitations of such devices. This results in the large document being captured as split components of an image. Hence, the split components need to be mosaiced into a single large document image. In this paper, we present a new and simple approach to mosaic two split images of a large document by matching the sums of the pixel values within a window in the split images. The method compares these window sums across the split images to identify the Overlapping Region (OLR). The OLR, a region common to both images, is then used to mosaic the two split images of the large document. The method assumes, however, that a small OLR is available at the end of the split images. In addition, the OLR identified in the split images depends on the size of the window. Experimental results show that the performance of the proposed method is satisfactory.

© 2005 Elsevier B.V. All rights reserved.

Keywords: Window matching; Overlapping region; Document image mosaicing

1. Introduction

Often, it is not possible to capture the complete image of a large document in a single exposure, as most image capturing media work with documents of a definite size. In such cases, the document has to be scanned part by part, producing split images.
Thus, document image analysis and processing require mosaicing of the split images to obtain a complete image of the document. Document image mosaicing is the process of merging split images, obtained by scanning or capturing a single large document part by part with some OLR between the parts, in order to restore the original, complete document image without duplicating any portion.

Several researchers have proposed methods for obtaining a large image from its split images. Schutte and Vossepoel (1995) described the use of a flatbed scanner to capture large utility maps [9]. The method selects control points in different utility maps to find the displacement required for shifting from one map to the next. These control points are found from pairs of edges common to both maps. However, the process requires human intervention to mask out the region that is not common to both split images.

A method for Document Image Mosaicing (DIM) was proposed in [1,2]: a feature-based approach that estimates the motion from point correspondences. The exhaustive search adopted is computationally expensive because the image is rotated during matching. In addition, the method demands a 50% OLR in the split images to produce the mosaiced image. Moreover, these approaches are limited to text documents and are prone to failure for general documents containing pictures, whereas in practice a typical document contains both text and pictures.

An automatic mosaicing process for split document images containing both text and pictures, based on a correlation technique, was proposed in [3]. Here, the correlation technique was used to find the position of the best match in the split images. However, accuracy is lost at the edges of the images. Moreover, correlating two images of practical size is computationally very expensive.
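To make the idea of a correlation-based search for the best match position concrete, the following is a minimal sketch in plain Python. It is not the implementation of [3]. The function names, the strip width and the toy 3 x 8 "document" are all hypothetical, and the sketch assumes a purely horizontal split with equal image heights, no rotation and no noise.

```python
from math import sqrt

def column_strip(image, start, width):
    """Flatten `width` columns of `image`, beginning at column `start`, row by row."""
    return [pixel for row in image for pixel in row[start:start + width]]

def normalized_correlation(a, b):
    """Pearson correlation of two equal-length pixel lists (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0

def find_overlap_start(left, right, strip_width):
    """Return (column, score) where the leading strip of `right` best matches `left`."""
    template = column_strip(right, 0, strip_width)
    best_col, best_score = 0, float("-inf")
    # Every candidate column revisits all pixels of the strip, which is why
    # this kind of search becomes costly for images of practical size.
    for col in range(len(left[0]) - strip_width + 1):
        score = normalized_correlation(column_strip(left, col, strip_width), template)
        if score > best_score:
            best_col, best_score = col, score
    return best_col, best_score

def mosaic(left, right, overlap_start):
    """Keep `left` up to the overlap, then append all of `right`."""
    return [l_row[:overlap_start] + r_row for l_row, r_row in zip(left, right)]

# Hypothetical toy "document" (grey values), split with a 3-column overlap.
document = [
    [12, 40, 7, 90, 33, 61, 5, 78],
    [55, 3, 88, 21, 70, 14, 95, 42],
    [30, 66, 19, 47, 8, 83, 26, 50],
]
left = [row[:6] for row in document]    # columns 0..5
right = [row[3:] for row in document]   # columns 3..7

col, score = find_overlap_start(left, right, strip_width=2)
merged = mosaic(left, right, col)
print(col, merged == document)
```

In this toy case the leading strip of the right image matches the left image exactly at column 3, so the merged result reproduces the original document. On real scans the match is only approximate, and the number of candidate positions multiplied by the strip size dominates the running time.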
In order to find a solution, additional constraints such as a priori knowledge were introduced. Here, the sequence in which the images were captured and their placement (generally referred to as image sequencing) are known. A template matching procedure was used to search for the OLRs present in the split document images. However, template matching is usually a time-consuming procedure.

Image and Vision Computing 24 (2006) 94–100
www.elsevier.com/locate/imavis
0262-8856/$ - see front matter © 2005 Elsevier B.V. All rights reserved.
doi:10.1016/j.imavis.2005.09.015
* Corresponding author. Tel.: +65 687 46806.
E-mail addresses: hudempsk@yahoo.com (P. Shivakumara), shiva@comp.nus.edu.sg (P. Shivakumara).

In addition, this approach assumes that the printed text lies on straight and