Published in IET Information Security
Received on 16th June 2009
doi: 10.1049/iet-ifs.2010.0100
ISSN 1751-8709

Application of fuzzy logic and genetic algorithm in biometric text-independent writer identification

S.M. Saad
Department of Information Technology, Institute of Graduate Studies and Research, University of Alexandria, 163 El-Horreya Avenue, El Shatby 21526, P.O. Box 832, Alexandria, Egypt
E-mail: saad.darwish@gmail.com

Abstract: The identification of a person on the basis of scanned images of handwriting is a useful biometric technique with application in forensic document analysis. This study describes the design and implementation of a system that identifies the writer using offline Arabic handwritten text. The key point is the use of multiple features to capture different aspects of handwriting individuality and to operate at different levels of analysis, with the aim of improving identification performance. Fuzzy logic (FL) and a genetic algorithm (GA) are used in a complementary fashion to fuse (combine) the extracted features and to deal with the ambiguity of human judgment of handwriting similarity. The GA helps to construct and tune the fuzzy membership functions that FL needs in order to categorise the strength of similarity between handwriting features, with the purpose of yielding high correct identification rates. The final results indicate that the proposed system achieves a test identification accuracy of up to 96% for Arabic text.

1 Introduction

With the increasing use of computers in every aspect of life, automatic person identification is becoming an important problem and is receiving growing interest from both academia and industry. A person can be identified by many methods, including the most commonly used one, the password, or by making use of either the static biometric features of the person (e.g. fingerprint, face, iris pattern) or dynamic biometric features (e.g. handwriting, voice) [1].
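To make the abstract's idea of GA-tuned fuzzy membership functions more concrete, the following is a minimal sketch, not the system described in this paper. It assumes a single similarity score in [0, 1] per pair of handwriting samples, two triangular fuzzy sets ("low" and "high" similarity) and a very small elitist GA; the function names, the toy data in `SAMPLES` and all GA settings are hypothetical choices made only for illustration.

```python
import random

def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b (a <= b <= c)."""
    if x <= a or x >= c:
        return 1.0 if x == b else 0.0  # handles shoulder sets where a == b or b == c
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def same_writer(score, params):
    """Decide 'same writer' when 'high' similarity outweighs 'low' similarity."""
    lo_end, hi_start = params  # the two breakpoints tuned by the GA
    low = triangular(score, 0.0, 0.0, lo_end)
    high = triangular(score, hi_start, 1.0, 1.0)
    return high > low

def fitness(params, samples):
    """Fraction of labelled (score, is_same_writer) pairs classified correctly."""
    hits = sum(same_writer(s, params) == label for s, label in samples)
    return hits / len(samples)

def evolve(samples, generations=30, pop_size=20, seed=0):
    """Tiny elitist GA: keep the best half, refill by crossover plus mutation."""
    rng = random.Random(seed)
    pop = [(rng.uniform(0.1, 1.0), rng.uniform(0.0, 0.9)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: fitness(p, samples), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover per gene, Gaussian mutation, clamp to [0, 1].
            child = tuple(
                min(1.0, max(0.0, rng.choice(pair) + rng.gauss(0.0, 0.05)))
                for pair in zip(a, b)
            )
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda p: fitness(p, samples))

# Hypothetical toy data: (similarity score, True if the pair has the same writer).
SAMPLES = [(0.1, False), (0.2, False), (0.3, False), (0.35, False),
           (0.6, True), (0.7, True), (0.8, True), (0.9, True)]

if __name__ == "__main__":
    best = evolve(SAMPLES)
    print("tuned breakpoints:", best, "accuracy:", fitness(best, SAMPLES))
```

In this sketch the GA searches only over the two breakpoints `lo_end` and `hi_start`; a real system would presumably tune several membership functions per feature and combine them through fuzzy rules.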
Writer identification, the task of determining the writer from his or her handwriting, is a technique that satisfies four requirements of personal identification: it is accessible, cheap, reliable and acceptable [2, 3]. Consequently, despite the existence of other biometric techniques, writer identification remains an attractive application. Handwriting-based personal identification has a wide variety of potential applications, from security, forensics and financial activities to archaeology (identifying the writers of ancient documents).

Research into writer identification has focused on two streams: online and offline writer identification [1, 2]. The former assumes that a transducer device (tablet digitiser) is connected to the computer and converts the writing movement into a sequence of signals. Online handwriting makes it possible to use velocity, pressure and spatial information along with pen-up and pen-down events, none of which are available with offline data. As a result, high identification accuracy is easier to achieve with online writer identification than with offline identification [3]. Unfortunately, online systems are inapplicable in many cases (e.g. archaeology). Therefore developing effective techniques for offline systems is a vital task. Offline systems, in comparison, deal with handwriting that has been scanned into a computer file as a two-dimensional (2D) image. These systems use image processing and pattern recognition techniques to solve the different types of problems encountered: pre-processing, feature extraction and selection, sample comparison and performance evaluation [4].

Offline writer identification systems can be divided into two approaches, text-dependent and text-independent [1, 3, 5], depending on how identification is implemented with regard to the registered writer's samples.
The text-dependent approach requires handwriting of a specific text (e.g. a signature). Commonly, the geometric or structural features of the given characters or words are extracted as the writing features. The first major problem with text-dependent systems is that they are not applicable where the specific text is not available, such as in criminal justice systems when text documents with different content need to be compared. Second, text-dependent systems are more prone to forgery (replay attack) because the same data are presented for testing. Text-independent systems, on the other hand, model the style information independently of the content and can identify the writer based on any given text. This usually requires statistics of features computed from a large quantity of data, to avoid anomalies due to a specific text.

One of the main difficulties associated with the writer identification task is to define a set of features able to reflect the large variability observed in different samples of script from the same writer over time or from different writers. There is no ideal mathematical model that can

IET Inf. Secur., 2011, Vol. 5, Iss. 1, pp. 1–9; doi: 10.1049/iet-ifs.2010.0100; © The Institution of Engineering and Technology 2011; www.ietdl.org