A new method for perspective correction of document images
José Rodríguez-Piñeiro a, Pedro Comesaña-Alfaro a,b, Fernando Pérez-González a,b,c and Alberto Malvido-García d
a University of Vigo, Signal Theory and Communications Dept., Vigo, Spain
b University of New Mexico, Electrical and Computer Engineering Dept., Albuquerque, NM
c Gradiant (Galician Research and Development Center in Advanced Telecommunications), Vigo, Spain
d Bit Oceans Research, López de Neira, 3, Office 408, 36202 Vigo, Spain
ABSTRACT
In this paper we propose a method for perspective distortion correction of rectangular documents. The scheme exploits the orthogonality of the document edges, which allows the aspect ratio of the original document to be recovered. The results obtained after correcting the perspective of several document images captured with a mobile phone are compared with those achieved by digitizing the same documents with several scanner models.
Keywords: Document image, perspective distortion correction
1. INTRODUCTION
The increasing performance and low price of portable imaging devices (e.g. mobile phones, PDAs) are boosting their use as a supplement to, or even a replacement for, traditional flat-bed scanners for document image acquisition. Unfortunately, this growing use has brought a number of problems that traditional scanners do not have to face, with perspective distortion being one of the most evident and probably the most harmful for the subsequent application of document processing tools. Although this problem has already received some attention in the literature, most of the proposed solutions rely on rather restrictive assumptions about the nature of the captured document. The aim of this work is therefore to follow a systematic approach with a minimum number of constraints. Indeed, the only constraint imposed on our method is that the four corners of the document are captured in the considered image.
Next, we review some of the most representative methods related to perspective distortion correction. These methods can be classified into two main categories. The methods in the first category use the text in the document to characterize the perspective distortion. The second category encompasses those algorithms that do not require the original document to contain text. Both categories are discussed in more detail below.
Further author information: (Send correspondence to P. C.-A.) P. C.-A.: E-mail: pcomesan@gts.uvigo.es, Telephone: +34 986818655
1.1 Methods requiring text in the document
Clark and Mirmehdi [1] proposed a perspective distortion correction method based on vanishing point recovery, where these points provide the information required to correct the perspective distortion of the captured document. In that work, the recovery of the vanishing points relies on the assumption that a text paragraph must display some sort of left, right, centered or full formatting. Its main drawbacks are probably the computational cost of some steps of the correction method (including several image transformations and exhaustive searches over some parameters, such as the horizontal vanishing point) and the need to know some correspondences between imaged points and their real-world counterparts.
A different approach is followed by Lu et al. [2], who propose a method based on applying morphological operators. This method needs neither high-contrast document boundaries nor paragraph formatting information. Nevertheless, it is constrained to text documents and, even more importantly, it relies on some parameters that require considerable knowledge about the document contents, like the number of characters