A HYBRID RECOGNITION METHOD FOR DOCUMENT IMAGES

ZHANG Yudong, WU Lenan, WANG Shuihua
School of Information Science and Engineering, Southeast University
No. 2, Sipailou, Nanjing 210096, China
zhangyudongnuaa@gmail.com

ABSTRACT
In order to improve the performance of document image recognition, a GLCM-based classifier is first proposed and its shortcomings are analyzed. Then, new features based on the rectangular frame histogram (RFH) are presented and added to the feature set. The hybrid classifier performs better than the GLCM-based classifier in terms of classification ratio and false-alarm ratio.

KEY WORDS
document image, gray-level co-occurrence matrix

1. Introduction

A document image (DI) is defined as an ensemble of different types of text, tables, and flow diagrams [1]. DIs are commonly used in electronic government affairs, legislative affairs, educational instruction, etc. With the popularization of image input devices such as scanners, digital cameras, and digital video recorders, the number of document images on the web is growing rapidly, and their analysis and recognition have become a major research subject.

Since the blocks of text in a DI have a distinctive visual appearance, they can be considered a texture [2]. Therefore, the problem of distinguishing DIs from non-document images (NDIs) may be tackled by means of texture classification.

Texture feature descriptors can be classified into two categories according to the order of the statistical function used: first-order texture features and second-order texture features. First-order texture features, also known as gray-level distribution moments (GDM), are extracted exclusively from the intensity histogram and thus yield no information about the locations of the pixels. Second-order texture features take into account the position of one pixel relative to another.
The most widely used second-order method is the gray-level co-occurrence matrix (GLCM) method, which constructs matrices by counting the number of occurrences of pixel pairs with given intensities at a given displacement [3].

In this paper, the use of the GLCM for classifying DIs and NDIs is investigated first. We found that the GLCM alone is insufficient, since some special NDIs, such as pen drawings composed of many strokes, exhibit the same GLCM-based features as DIs. In order to improve the classification, a novel feature based on the rectangular frame histogram (RFH) is proposed. The resulting hybrid classifier combines the features of both GLCM and RFH and shows a higher classification ratio and a lower false-alarm ratio.

2. GLCM of DIs & NDIs

The GLCM is a square matrix whose elements correspond to the relative frequency of occurrence of pairs of gray-level values of pixels separated by a certain distance in a given direction [4]. Formally, the element G(i,j) of a GLCM for a displacement vector (a,b) is defined as

$$G(i,j) = \left|\{\, (x,y),(t,v) : I(x,y) = i \ \&\ I(t,v) = j \,\}\right| \tag{1}$$

where (t,v) = (x+a, y+b), and |·| denotes the cardinality of a set. The displacement vector (a,b) can be rewritten as (d, θ) in polar coordinates. It is commonly suggested that GLCMs be calculated for four displacement vectors with d = 1 and θ = 0°, 45°, 90°, and 135°, respectively. In this study, (a,b) is chosen as (0,1), (-1,1), (-1,0), and (-1,-1), respectively, and the corresponding GLCMs are averaged.

Fig. 1 shows some representative NDIs and their GLCMs. It can be seen from Fig. 1 that the GLCMs of NDIs are nearly diagonal, with entries that decrease rapidly away from the diagonal.

(a) Peppers and its GLCM

665-002. Proceedings of the Twelfth IASTED International Conference on Intelligent Systems and Control (ISC 2009), November 2-4, 2009, Cambridge, MA, USA
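The GLCM construction of Section 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `glcm` and `averaged_glcm`, the `levels` parameter (number of gray levels), and the convention that x indexes rows and y indexes columns are all assumptions made for illustration. The sketch counts pixel pairs as in Eq. (1) and averages the matrices over the four displacement vectors listed in the text.

```python
import numpy as np

# The four displacement vectors (a, b) given in Section 2.
DISPLACEMENTS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]


def glcm(image, a, b, levels):
    """Count pairs (x, y), (t, v) = (x + a, y + b) with I(x, y) = i and I(t, v) = j.

    Assumption: `image` is a 2-D array of integer gray levels in [0, levels),
    with x as the row index and y as the column index.
    """
    image = np.asarray(image)
    rows, cols = image.shape
    # Keep only positions (x, y) whose displaced partner stays inside the image.
    x0, x1 = max(0, -a), min(rows, rows - a)
    y0, y1 = max(0, -b), min(cols, cols - b)
    src = image[x0:x1, y0:y1]                  # values I(x, y)
    dst = image[x0 + a:x1 + a, y0 + b:y1 + b]  # values I(t, v)
    g = np.zeros((levels, levels), dtype=np.int64)
    # Unbuffered accumulation so that repeated (i, j) pairs are all counted.
    np.add.at(g, (src.ravel(), dst.ravel()), 1)
    return g


def averaged_glcm(image, levels):
    """Average the GLCMs of the four displacement vectors."""
    return np.mean([glcm(image, a, b, levels) for a, b in DISPLACEMENTS], axis=0)


if __name__ == "__main__":
    # Tiny two-level test image (hypothetical data, for illustration only).
    img = np.array([[0, 0, 1],
                    [1, 1, 0]])
    print(glcm(img, 0, 1, levels=2))
    print(averaged_glcm(img, levels=2))
```

The sketch returns raw pair counts; dividing each matrix by its sum would give the relative frequencies mentioned in the definition. Whether the authors normalize before averaging is not stated in the text, so that step is left out here.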