Преглед НЦД 8 (2006), 29–35

Andrey Andreev, Nikolay Kirov
(Institute of Mathematics and Informatics, Bulgarian Academy of Sciences)

WORD IMAGE MATCHING IN BULGARIAN HISTORICAL DOCUMENTS¹

Abstract: An approach to word image matching based on the Hausdorff distance is examined for poor-quality typewritten, printed or handwritten Bulgarian documents. Detailed computer experiments were carried out on 49 pages of typewritten text, 13 pages of printed text and 2 pages of a manuscript. The results of several methods are compared, including methods previously reported in the literature.

Keywords: document text image, bitmap file, word matching, Hausdorff distance

1. Introduction

The Hausdorff distance used in this paper differs slightly from those used by other authors. The computer experiments indicate that our method outperforms them despite its simplicity.

Let A and B denote bounded sets in the plane, and let a and b be points in the plane with coordinates $a = (a_1, a_2)$, $b = (b_1, b_2)$. The Hausdorff distance (HD) between two bounded sets A and B is defined in [4], for the purposes of approximation of discontinuous functions, as

(1) $r(A,B) = \max\{h(A,B),\ h(B,A)\}$,

where

(2) $h(A,B) = \max_{a \in A} \min_{b \in B} p(a,b)$,

(3) $p(a,b) = \max\{|a_1 - b_1|,\ |a_2 - b_2|\}$.

In 1994 Dubuisson and Jain [1] examined 24 distance measures of Hausdorff type for determining to what extent two point sets A and B in the plane differ. When the sets A and B consist of $N_A$ and $N_B$ points, respectively, and (3) is replaced by the Euclidean distance, they use

¹ This research has been supported by a Marie Curie Fellowship of the European Community program “Knowledge Transfer for Digitalization of Cultural and Scientific Heritage in Bulgaria” under contract number MTKD-CT-2004-509754. The work was done while the authors were at the National Center of Scientific Research “Demokritos”, Athens, Greece.
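Definitions (1)–(3) can be illustrated for finite point sets with a minimal Python sketch. It is not the authors' implementation: the function names `chebyshev`, `directed_hausdorff` and `hausdorff` and the sample point sets are assumptions introduced here for illustration, and the brute-force search is chosen for clarity rather than speed.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def chebyshev(a: Point, b: Point) -> float:
    """Point distance p(a, b) from (3): the larger coordinate difference."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def directed_hausdorff(A: Sequence[Point], B: Sequence[Point]) -> float:
    """Directed distance h(A, B) from (2): for each point of A take the
    distance to its nearest point of B, then take the largest such value."""
    return max(min(chebyshev(a, b) for b in B) for a in A)


def hausdorff(A: Sequence[Point], B: Sequence[Point]) -> float:
    """Hausdorff distance r(A, B) from (1): the larger of the two
    directed distances."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))


# Hypothetical sample sets, chosen only to show that h is not symmetric.
A = [(0, 0), (1, 0)]
B = [(0, 0), (4, 3)]

print(directed_hausdorff(A, B))  # 1
print(directed_hausdorff(B, A))  # 3
print(hausdorff(A, B))           # 3
```

In this example every point of A lies within distance 1 of B, while the point (4, 3) of B is at distance 3 from its nearest point of A, so the two directed distances differ and (1) takes the larger one.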