ACADEMIA Letters
Lettrine Recognition in Ancient Documents
Nizar Zaghden, SETIT
Abstract
In this paper, we study lettrine recognition in ancient documents, based on the local
and global characteristics of the lettrines. With genericity in mind, we propose a generic
procedure consisting of four steps: first, the extraction of all the lettrines present in the
document; second, the computation of feature vectors using diverse algorithms; third,
the classification of the lettrines into the classes of our lettrine database; and finally,
the retrieval of the lettrines that are similar to a query image. Many tests have
been carried out, and they show the strength of our system in terms of classification,
as well as the correspondence between the query image and the images returned by our
system.
Keywords: Old documents, lettrines, segmentation, extraction of features, classification,
SIFT, SURF, Wavelets
1. Introduction
In this paper, we are interested in content-based image retrieval. We also
present the different existing features used to describe an image and their usefulness given
the specificities of old documents [1, 2, 3]. The recognition algorithm proposed in this study
proceeds in four steps. Given an ancient document, a segmentation step is first
carried out to decompose the image into homogeneous regions. The result of
segmentation is a set of segments that collectively cover the whole image, or
the set of shapes extracted from the image. This step aims at extracting the lettrines, focusing on
the specific features of this ornamental letter. We then deal with the extraction of
the features (the feature vector), such as the mean, variance, energy, and the SIFT characteristics, when
Academia Letters, July 2021
Corresponding Author: Nizar Zaghden, nizar.zaghden@gmail.com
Citation: Zaghden, N. (2021). Lettrine Recognition in Ancient Documents. Academia Letters, Article 1586.
https://doi.org/10.20935/AL1586.
©2021 by the author — Open Access — Distributed under CC BY 4.0