Assessing and Improving the Quality of Document Images Acquired with Portable Digital Cameras

Rafael Dueire Lins, Gabriel Pereira e Silva, André Ricardson Gomes e Silva
Departamento de Eletrônica e Sistemas, Universidade Federal de Pernambuco, Brazil
rdl@ufpe.br, gfps@cin.ufpe.br, andrericardson@yahoo.com.br

Abstract

Professionals and students in many different areas have started to use portable digital cameras to photograph documents instead of photocopying them. This article analyses the quality of such document images for Optical Character Recognition and proposes ways of improving their transcription and readability.

1. Motivation

The fast growth in the image quality of portable digital cameras, together with a drastic price reduction, has widened their applicability into unforeseen domains. One of them is the use of portable digital cameras for digitizing documents. Students and professionals in many different areas now use those devices as a fast way to acquire document images, taking advantage of their low weight, portability, low cost and small dimensions. This new research area [1][2] is evolving fast in many different directions and calls for new algorithms, tools and processing environments that provide users with simple ways of visualizing, printing, transcribing, compressing, storing and transmitting such images through networks.

Reference [3] points out some particular problems that arise in this document digitization process. The first is background removal: very often the photograph goes beyond the document boundaries and incorporates parts of the area that served as mechanical support for taking the photo. The second problem is the skew often found in the image in relation to the photograph axes: as documents have no fixed mechanical support, there is very often some degree of inclination in the document image. The third problem is non-frontal perspective, which arises for the same reasons that give rise to skew.
A fourth problem is caused by the distortion of the camera lens. As a result, the perspective distortion follows a convex line rather than a straight one, to a degree that depends on the quality of the lens and on the relative position of the camera and the document. The fifth difficulty in processing document images acquired with portable cameras is non-uniform illumination.

This paper focuses on assessing the output of commercial OCR (Optical Character Recognition) software for such documents and follows the steps pointed out in [3] to improve their transcription and readability.

2. Assessment methodology

Assessing image quality is a task of the utmost complexity. Although the human visual-neural system is extremely sophisticated, subjectivity plays an important role in image recognition and in judging quality, and should therefore be avoided by every means. In this paper the assessment methodology was limited to analyzing the performance of commercial OCR tools. Omnipage Professional 15.0 from Nuance [4] was used, because it is possibly the best general-purpose OCR tool available today.

This study compares the results obtained by transcribing a batch of 50 documents scanned with an HP scanner (model 3200c) at 100, 150 and 300 dpi resolution in true color with the results obtained by transcribing the same documents photographed with cameras of 3.2 and 4.1 megapixels. These results are later used to assess the gains obtained after each processing step.

In turn, analyzing the results of OCR tools is far from being a trivial task. The methodology presented in reference [5], which takes into account the nature of the transcription errors, was adopted here. The errors were classified as follows:

1. Character errors (character replacement).
2. Missing characters.
3. Character insertion.
4. Graphical accent errors.
5. Word errors (number of incorrect words).
6. Missing words (complete words not transcribed).
7. Punctuation errors.
Words are lexemes with at least three characters, avoiding stopwords and isolated characters.
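
The error classes above can be illustrated with a minimal sketch that aligns a reference transcription with the OCR output and counts each kind of mismatch. This is not the procedure of reference [5], whose details are not given here. It is an assumption-laden illustration in which the alignment comes from Python's difflib.SequenceMatcher, the function names (char_errors, word_errors, content_words, strip_accents) are hypothetical, and STOPWORDS is a deliberately tiny placeholder list.

```python
import difflib
import string
import unicodedata
from collections import Counter

# Hypothetical, deliberately tiny stopword list (assumption, not from the paper).
STOPWORDS = {"the", "and", "for", "with", "that", "this"}


def strip_accents(text):
    """Remove combining marks so that, e.g., 'é' compares equal to 'e'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def char_errors(reference, ocr):
    """Count character-level errors (classes 1-4 and 7) from an alignment."""
    # Use composed characters so an accented letter is a single symbol.
    reference = unicodedata.normalize("NFC", reference)
    ocr = unicodedata.normalize("NFC", ocr)
    counts = Counter()
    matcher = difflib.SequenceMatcher(None, reference, ocr, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        ref_part, ocr_part = reference[i1:i2], ocr[j1:j2]
        # Pair characters positionally inside a mismatched block.
        for r, o in zip(ref_part, ocr_part):
            if r in string.punctuation or o in string.punctuation:
                counts["punctuation"] += 1
            elif strip_accents(r) == strip_accents(o):
                counts["accent"] += 1
            else:
                counts["replacement"] += 1
        # Unpaired leftovers are missing from, or inserted into, the OCR text.
        # (Simplification: dropped or added punctuation is also counted here.)
        extra = len(ref_part) - len(ocr_part)
        if extra > 0:
            counts["missing"] += extra
        elif extra < 0:
            counts["insertion"] += -extra
    return counts


def content_words(text):
    """Words as defined above: at least three characters, no stopwords."""
    tokens = (t.strip(string.punctuation).lower() for t in text.split())
    return [t for t in tokens if len(t) >= 3 and t not in STOPWORDS]


def word_errors(reference, ocr):
    """Count incorrect words (class 5) and missing words (class 6)."""
    counts = Counter()
    ref_words, ocr_words = content_words(reference), content_words(ocr)
    matcher = difflib.SequenceMatcher(None, ref_words, ocr_words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        n_ref, n_ocr = i2 - i1, j2 - j1
        counts["word_error"] += min(n_ref, n_ocr)
        counts["word_missing"] += max(n_ref - n_ocr, 0)
    return counts


if __name__ == "__main__":
    # 'm' misread as 'rn': one replacement plus one inserted character.
    print(dict(char_errors("document", "docurnent")))
    # prints {'replacement': 1, 'insertion': 1}
    print(dict(word_errors("the quick brown fox", "the quick brovvn")))
    # prints {'word_error': 1, 'word_missing': 1}
```

SequenceMatcher does not guarantee a minimum edit-distance alignment, so the split between replacements, insertions and missing characters may differ from what a Levenshtein-based aligner would report on harder inputs. For a real evaluation the stopword list would also need to match the language of the documents.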