Abstract
This work presents a complete method for improving handwritten document recognition. In this task some characters are confused with others because of their visual or structural similarity. SOM and TreeSOM neural networks were used to group different characters into metaclasses. Within each metaclass a zoning approach was applied to obtain specific features that improve character classification. The experiments with this new approach were performed on the NIST database with the classic MLP and the fast RBF-DDA neural network.

I. INTRODUCTION

Document recognition is an important research area in intelligent systems. Commercially, several companies use this technology to solve complex real-world tasks. Character recognition systems are increasingly powerful for printed characters; however, much remains to be improved in handwritten recognition. In automatic handwritten recognition, the major issue is the visual and structural similarity among some characters, e.g. the letters “I” and “J”. The use of metaclasses is a recognition approach that deals with this type of problem [1]. The metaclass approach builds clusters of similar characters and considers different characteristics, using a local strategy to recognize each cluster of characters.

The human strategy for character recognition is similar to the metaclass approach: we first associate a well-known part of the character, and then a specialized recognition is used [2]. Based on this assumption, many works have proposed local processing strategies. The use of parts of the characters to extract local features is called zoning. Some authors proposed empirical zoning [2] [3] and others automatic zoning [4].

In this paper we propose a new approach to build metaclasses. The metaclasses were created by a SOM neural network [6]. The SOM technique allows the creation of clusters containing elements with similar characteristics.
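As an illustration of how a SOM can group similar characters, the following is a minimal sketch in Python with NumPy, not the implementation used in this work. The 2-D toy features, the class centres, the 1x2 map size and the function names `train_som`, `best_matching_unit` and `build_metaclasses` are all assumptions made for the example. The rule that assigns a class to the map unit winning most of its samples is also an assumption, and no TreeSOM evaluation of the map is performed.

```python
import numpy as np


def best_matching_unit(weights, x):
    """Return the (row, col) grid position of the unit closest to x."""
    dists = np.linalg.norm(weights - x, axis=-1)
    row, col = np.unravel_index(np.argmin(dists), dists.shape)
    return int(row), int(col)


def train_som(data, rows, cols, epochs=40, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows, cols, dim)."""
    rng = np.random.default_rng(seed)
    n_samples, dim = data.shape
    # Initialise every unit from a randomly drawn training sample.
    picks = rng.integers(0, n_samples, size=rows * cols)
    weights = data[picks].astype(float).reshape(rows, cols, dim)
    # Grid coordinates of each unit, used by the Gaussian neighbourhood.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    sigma0 = max(rows, cols) / 2.0
    total_steps = epochs * n_samples
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(n_samples):
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # radius shrinks, floored at 0.5
            bmu = np.array(best_matching_unit(weights, data[i]))
            grid_dist2 = np.sum((grid - bmu) ** 2, axis=-1)
            influence = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            # Pull the winner and its grid neighbours towards the sample.
            weights += lr * influence[..., None] * (data[i] - weights)
            step += 1
    return weights


def build_metaclasses(weights, data, labels):
    """Group classes by the unit that wins most of their samples (assumed rule)."""
    labels = np.asarray(labels)
    groups = {}
    for label in sorted(set(labels.tolist())):
        bmus = [best_matching_unit(weights, x) for x in data[labels == label]]
        winner = max(set(bmus), key=bmus.count)
        groups.setdefault(winner, []).append(label)
    return sorted(groups.values())


# Hypothetical toy features: "I"/"J" lie close together, as do "O"/"Q".
toy_rng = np.random.default_rng(1)
centres = {"I": (0.0, 0.0), "J": (0.3, 0.0), "O": (5.0, 5.0), "Q": (5.3, 5.0)}
labels = [name for name in centres for _ in range(20)]
data = np.vstack([toy_rng.normal(c, 0.05, size=(20, 2)) for c in centres.values()])

weights = train_som(data, rows=1, cols=2)
metaclasses = build_metaclasses(weights, data, labels)
print(metaclasses)
```

On this toy data the two pairs of nearby classes should fall into separate groups. The map size controls how coarse the grouping is: a larger map would tend to split each pair again, which is why the choice of cluster map needs to be evaluated.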
To find the best SOM cluster map we used the TreeSOM evaluation technique [7], so the clusters were built according to the best possible cluster composition.

V. Macário is with the Department of Statistics and Informatics, Federal Rural University of Pernambuco, Recife - Pernambuco, 52171-900, Brazil, and with the Center of Informatics, Federal University of Pernambuco, Recife - Pernambuco, 50740-540, Brazil (e-mail: vmf2@cin.ufpe.br).
G.F.P. Silva is with the Academic Unit of Garanhuns, Federal Rural University of Pernambuco, Garanhuns - Pernambuco, 55292-270, Brazil (e-mail: gfps.ufrpe@gmail.com).
M.R.P. Souza, C. Zanchettin and G.D.C Cavalcanti are with the Center of Informatics, Federal University of Pernambuco, Recife - Pernambuco, 50740-540, Brazil (e-mail: {mrps, cz, gdcc}@cin.ufpe.br).

After forming the metaclasses, we used the zoning mapping approach proposed by Freitas et al. [5] to differentiate the characters within each cluster. For each character zoning, 118 structural and directional features were extracted. To verify the performance of the proposed approach we used two classical neural network classifiers: RBF-DDA [8] and MLP [9]. Different experiments were performed with the NIST database [15]: without metaclasses or zoning, with metaclasses only, with metaclasses and zoning, and with zoning only.

The next section presents the zoning approach. Section 3 details the feature extraction techniques used. Section 4 explains the metaclass training. The experiments are presented in Section 5 and the final remarks are in the last section.

II. THE ZONING MECHANISM

In text recognition, the zoning approach is generally defined as dividing a standard complex text into several simple parts, so that the complex pattern can be recognized by examining the generated simple patterns. With the zoning approach, local and combination strategies can be used to simplify text recognition.

Suen et al. [2] and Li et al.
[3] applied the zoning approach to handwritten character classification. Four different configurations were investigated. In this approach each character is enclosed in a rectangle divided into Z parts, where Z = 2LR (L = left, R = right), 2UD (U = up, D = down), 4, and 6 zones. In Freitas et al. [5] the zoning approach was investigated using the zones Z = 4, 5H, 5V and 7, as presented in Figure 1.

[Figure 1 panels: 2LR, 2UD, 4, 5H, 5V, 7]
Fig. 1. Z = 2LR, 2UD, 4C, 5H, 5V and 7.

Metaclasses and Zoning for Handwritten Document Recognition
V. Macario, G.F.P. Silva, M.R.P. Souza, C. Zanchettin and G.D.C Cavalcanti
Proceedings of International Joint Conference on Neural Networks, Dallas, Texas, USA, August 4-9, 2013
978-1-4673-6129-3/13/$31.00 ©2013 IEEE

In this paper we investigate the zoning mechanism (methods of decomposition) by regions. In each zone we