Segmentation Based Recovery of Arbitrarily Warped Document Images

B. Gatos, I. Pratikakis and K. Ntirogiannis
Computational Intelligence Laboratory, Institute of Informatics and Telecommunications, National Center for Scientific Research “Demokritos”, GR-153 10 Agia Paraskevi, Athens, Greece
http://www.iit.demokritos.gr/~bgat/, {bgat,ipratika}@iit.demokritos.gr

Abstract

Non-linear warping appears in document images captured by a digital camera or a scanner, especially when the documents are digitized bound volumes. Arbitrarily warped documents may have several slope changes along the text lines as well as along the words of the same text line. In this paper, a novel segmentation based technique for efficient restoration of arbitrarily warped document images is presented. The proposed technique recovers the documents relying upon (i) text line and word detection using a novel segmentation technique appropriate for warped documents, (ii) a first draft binary image de-warping based on word rotation and translation according to upper and lower word baselines, and (iii) a recovery of the original warped image guided by the draft binary image de-warping result. Experimental results on several arbitrarily warped documents demonstrate the effectiveness of the proposed technique.

1. Introduction

Document image acquisition by a digital camera or a flatbed scanner often results in several image distortions. Non-linear warping is a major distortion that occurs especially when the scanned documents are bound volumes (see Fig. 1a). Warping not only diminishes the document’s readability but also reduces the accuracy of an OCR application. Several techniques have been proposed for correcting document image warping. They can be classified into two main categories: (i) 2D image processing techniques ([1], [2], [3], [4], [5]) and (ii) techniques based on 3D document shape reconstruction ([6], [7], [8]).
Our work is related to the first category of techniques, since the second category requires image capture with a special camera setup as well as a representation of the document surface by a 3D shape model.

Approaches of the first category have been reported by several authors. In [1], a deformable system to straighten curved text images is presented. Restoration is accomplished by using an active contour network based on an analytical model with cubic B-splines, which have proved more accurate than Bezier curves. A model fitting technique has also been proposed that uses cubic splines to define the warping model of the document image [2]. For more accurate de-warping, a vertical division of the document image into several partial document images is also suggested. Another model fitting technique [3] divides the document image into shaded and non-shaded regions and then uses polynomial regression to model the warped text lines with quadratic reference curves. In [4], the texture of a document image is calculated so as to infer the distortion of the document structure. A mesh of the warped image is built using a non-linear curve for each text line. The curves are fitted to text lines by tracking the character boxes on the text lines. Erroneously fitted curves are detected and excluded by a post-processing step based on several heuristics. The approach of [5] relies on a priori layout information and is based on a line-by-line de-warping of the observed paper surface. Each letter in the input image is enclosed within a quadrilateral cell, which is then mapped to a rectangle of correct size and position in the result image.
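To make the idea of modelling a warped text line with a quadratic reference curve more concrete, the following is a minimal sketch of a least-squares quadratic fit. It is not the implementation of [3]; the function names `fit_quadratic` and `curve_y`, and the assumption that each text line is available as a list of (x, y) character-box points, are ours and purely illustrative.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_quadratic(points):
    """Least-squares fit of y = a*x^2 + b*x + c to a list of (x, y) points.

    Hypothetical helper: the points are assumed to be, e.g., the bottom
    centres of the character boxes of one text line.
    """
    if len(points) < 3:
        raise ValueError("at least three points are needed")
    # Power sums used by the normal equations: s[k] = sum(x^k), t[k] = sum(y * x^k).
    s = [sum(x ** k for x, _ in points) for k in range(5)]
    t = [sum(y * x ** k for x, y in points) for k in range(3)]
    m = [[s[4], s[3], s[2]],
         [s[3], s[2], s[1]],
         [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    d = det3(m)
    if d == 0:
        raise ValueError("degenerate point set (fewer than three distinct x values)")
    # Cramer's rule: replace one column at a time by the right-hand side.
    coeffs = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        coeffs.append(det3(mc) / d)
    return tuple(coeffs)  # (a, b, c)


def curve_y(coeffs, x):
    """Evaluate the fitted reference curve at horizontal position x."""
    a, b, c = coeffs
    return a * x * x + b * x + c
```

Under these assumptions, the vertical offset `y - curve_y(coeffs, x)` of a pixel from the fitted curve could be used to shift it onto a straight line. For wide images it may be advisable to normalise the x coordinates before fitting, since the normal equations become ill-conditioned for large values.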
In order to recover arbitrarily warped gray scale document images, we propose a novel technique that is based on (i) text line and word detection using a novel segmentation technique appropriate for warped documents, (ii) a first draft binary image de-warping based on word rotation and translation according to upper and lower word baselines, and (iii) a recovery of the original warped image guided by the draft binary image de-warping result.

The remainder of this paper is structured as follows: In Section 2, we detail the proposed approach. Our experimental results are described in Section 3, while in Section 4, conclusions are drawn.
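As a rough illustration of step (ii) above, the sketch below rotates and translates the pixels of a single word so that its baseline becomes horizontal at a chosen height. The actual procedure is the one detailed in Section 2; here we assume, for illustration only, that a word is given as a list of (x, y) pixel coordinates together with the two end points of its lower baseline, and the names `baseline_angle` and `dewarp_word` are hypothetical.

```python
import math


def baseline_angle(start, end):
    """Slope angle (radians) of the baseline running from start to end."""
    return math.atan2(end[1] - start[1], end[0] - start[0])


def dewarp_word(pixels, base_start, base_end, target_y):
    """Rotate word pixels so the baseline is horizontal, then move it to target_y.

    Illustrative assumption: only the lower baseline is used, and the
    rotation is performed about the baseline start point.
    """
    theta = baseline_angle(base_start, base_end)
    cos_t, sin_t = math.cos(-theta), math.sin(-theta)
    x0, y0 = base_start
    result = []
    for x, y in pixels:
        dx, dy = x - x0, y - y0
        # Rotate by -theta about the baseline start point.
        xr = x0 + dx * cos_t - dy * sin_t
        yr = dx * sin_t + dy * cos_t  # signed vertical offset from the baseline
        # Translate vertically so that the baseline lies on target_y.
        result.append((xr, target_y + yr))
    return result
```

For example, `dewarp_word(word_pixels, (12, 40), (60, 52), 45)` would return the coordinates of `word_pixels` after straightening a word whose baseline rises from (12, 40) to (60, 52), with the straightened baseline placed on row 45. A full implementation would also need to resample the image at the new, non-integer coordinates.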