Word Spotting for Handwritten Documents using Chamfer Distance and Dynamic Time Warping

Raid Saabni
Computer Science Department, Ben-Gurion University of the Negev, Israel
Triangle R&D Center, Kafr Qarea, Israel
saabni@cs.bgu.ac.il

Jihad El-sana
Computer Science Department, Ben-Gurion University of the Negev, Beer Sheva, Israel
el-sana@cs.bgu.ac.il

A large number of handwritten historical documents are held in libraries around the world. The desire to access, search, and explore these documents paves the way for a new age of knowledge sharing and promotes collaboration and understanding between human societies. Currently, the indexes for these documents are generated manually, which is tedious and time-consuming. State-of-the-art techniques for converting complete images of handwritten documents into textual representations do not yet produce sufficient results. Therefore, word-spotting methods have been developed to archive and index images of handwritten documents in order to enable efficient searching within documents. In this paper, we present a new matching algorithm for word-spotting tasks on historical Arabic documents. The algorithm is based on the Chamfer Distance and computes the similarity between the shapes of word-parts. Matching results are used to cluster images of Arabic word-parts into different classes using the Nearest Neighbor rule. To compute the distance between two word-part images, the algorithm subdivides each image into equal-sized slices (windows). A modified version of the Chamfer Distance, incorporating geometric gradient features and distance transform data, is used as a similarity distance between the different slices. Finally, the Dynamic Time Warping (DTW) algorithm is used to measure the distance between two images of word-parts. Using DTW enables our system to cluster similar word-parts even when they are deformed non-linearly, as is typical of handwriting.
We tested our implementation of the presented methods on various documents in different writing styles, taken from the Juma’a Al Majid Center in Dubai, and obtained encouraging results.

Keywords: Word Spotting, Handwriting Recognition, Dynamic Time Warping, Chamfer Distance

1 Introduction

Recent advances in imaging, storage, and network technology have paved the way for launching several projects designed to scan and digitize historical books and manuscripts. These projects aim to disseminate knowledge and provide access to rare documents and old manuscripts, which are kept in brick-and-mortar libraries around the world. The implications of exposing this fascinating heritage to the public are too obvious to enumerate. These documents are written in various languages and come from different regions; they discuss numerous subjects and topics; and they were written over many centuries.

In this work we concentrate on historical Arabic documents, since this collection is very large and has attracted only modest research attention. Between the seventh and fifteenth centuries a huge number of documents were written in Arabic on various subjects, ranging from science and philosophy to individuals’ diaries. More than seven million titles have survived the years and are currently available in museums, libraries, and private collections around the world.

Several projects aimed at digitizing historical Arabic documents have been initiated in recent years, for example at Al-Azhar University, the Alexandria Library, and the Qatar Heritage Library [1,2,3]. These projects demonstrate the importance of, and the need for, efficient and accurate algorithms for indexing and searching within document images. Currently, such indexes are built manually, which is a tedious, expensive, and very time-consuming task. Therefore, automating this task using word-spotting and keyword-searching algorithms is highly desirable.
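As a rough illustration of the matching pipeline outlined in the abstract (equal-width slices, a Chamfer-style distance between slices, and DTW over the slice sequences), the following minimal sketch may help. It is not the implementation presented in this paper: it uses a plain symmetric Chamfer distance on pixel coordinates rather than the modified version with gradient features and distance transform data, and the function names, the slice width, and the empty_cost penalty are all assumptions made for the example.

```python
from math import hypot, inf


def slice_points(image, width):
    """Cut a binary image (list of rows of 0/1) into vertical slices.

    Returns, for each slice, the foreground pixel coordinates relative to
    the slice's left edge. The last slice may be narrower than `width`.
    """
    n_cols = len(image[0])
    slices = []
    for start in range(0, n_cols, width):
        stop = min(start + width, n_cols)
        pts = [(r, c - start)
               for r, row in enumerate(image)
               for c in range(start, stop)
               if row[c]]
        slices.append(pts)
    return slices


def chamfer(a, b, empty_cost=10.0):
    """Plain symmetric Chamfer distance between two point sets.

    empty_cost is a hypothetical penalty for matching an empty slice
    against a non-empty one.
    """
    if not a and not b:
        return 0.0
    if not a or not b:
        return empty_cost

    def directed(p, q):
        # Average distance from each point in p to its nearest point in q.
        return sum(min(hypot(r1 - r2, c1 - c2) for r2, c2 in q)
                   for r1, c1 in p) / len(p)

    return 0.5 * (directed(a, b) + directed(b, a))


def dtw(seq_a, seq_b, cost):
    """Classic DTW: minimum accumulated cost of a monotone alignment."""
    n, m = len(seq_a), len(seq_b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(seq_a[i - 1], seq_b[j - 1])
            d[i][j] = c + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def word_part_distance(img_a, img_b, width=2):
    """Distance between two word-part images: DTW over slice distances."""
    return dtw(slice_points(img_a, width), slice_points(img_b, width), chamfer)
```

Because DTW may align one slice with several consecutive slices of the other image, a horizontally stretched copy of a shape can still be matched closely, which is the property the abstract relies on for handling non-linear handwriting deformations.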
In this paper we introduce a word-spotting algorithm for handwritten documents, including historical Arabic manuscripts, using a novel approach for matching word-images. We assume that the input to the proposed algorithm is a collection of binary images of handwritten text of reasonable quality. This assumption is not made to reduce the problem to a simple case, but to work in ac-