IJDAR (2017) 20:173–187
DOI 10.1007/s10032-017-0289-3
ORIGINAL PAPER
On writer identification for Arabic historical manuscripts
Abedelkadir Asi¹ · Alaa Abdalhaleem¹ · Daniel Fecker² · Volker Märgner² · Jihad El-Sana¹
Received: 2 May 2017 / Revised: 24 July 2017 / Accepted: 25 July 2017 / Published online: 1 August 2017
© Springer-Verlag GmbH Germany 2017
Abstract This paper introduces new methodologies for reliably
identifying writers of Arabic historical manuscripts. We
propose an approach that transforms key point-based features,
such as SIFT, into a global form that captures high-level
characteristics of writing styles. We suggest a modification
of a common local feature, the contour direction feature,
and show the contribution of combining local and global
features for writer identification. Our work also presents a novel
algorithm that determines the number of writers involved
in writing a given manuscript. The experimental study
confirms that this algorithm significantly improves
writer identification when applied to historical manuscripts.
Comprehensive experiments using different features and
classification schemes demonstrate the viability of the
suggested methodologies for reliable writer identification. The
presented techniques were evaluated on both historical and
modern documents, where the suggested features yielded very
promising results compared with state-of-the-art features.
✉ Alaa Abdalhaleem
alaaabd@cs.bgu.ac.il
Abedelkadir Asi
abedas@cs.bgu.ac.il
Daniel Fecker
Fecker@ifn.ing.tu-bs.de
Volker Märgner
maergner@ifn.ing.tu-bs.de
Jihad El-Sana
el-sana@cs.bgu.ac.il
¹ Department of Computer Science, Ben-Gurion University of the Negev, Beersheba, Israel
² Institute for Communications Technology, Technische Universität Braunschweig, Brunswick, Germany
Keywords Writer identification · Writer retrieval · Key
point-based features · Contour-based features · Supervised
learning · Hierarchical clustering · Classification
1 Introduction
Identifying the writer of a handwritten document is an emerging
research problem that has received significant
interest in recent years. It poses interesting research
challenges for document examiners, especially for historical
handwritten documents. Paleographers invest a considerable
amount of time in recognizing the writer of a questioned
manuscript. This explains the acute demand for
an automatic system for document writer recognition that can
scale up to handle the large number of digital manuscripts.
Such systems provide a list of suspected writers to human
experts, who still have the main role in determining the
individuality of handwriting.
Given a reference dataset of known writers, the
writer identification task aims to assign one of these writers
to a query document image. The writer retrieval task aims to
retrieve, out of a set of documents, the document images
written by the writer of the query document. It is important
to mention that in these tasks a writer is represented by the
writing style. In essence, we are identifying and retrieving
writing styles and not necessarily writers. However, to stay
in line with previous works, we use the common terminology
from the literature.
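The difference between the two tasks can be illustrated with a minimal sketch. It assumes each document image has already been reduced to a fixed-length feature vector, and it uses Euclidean distance with a nearest-neighbor rule. The feature values, the distance measure, and the names `identify_writer` and `retrieve_documents` are illustrative assumptions, not the features or classifiers evaluated in this paper.

```python
import math

def distance(a, b):
    """Euclidean distance between two feature vectors (an assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify_writer(query, reference):
    """Writer identification: return the known writer whose reference
    vector is closest to the query vector.
    reference: list of (writer_id, feature_vector) pairs."""
    writer_id, _ = min(reference, key=lambda item: distance(query, item[1]))
    return writer_id

def retrieve_documents(query, collection):
    """Writer retrieval: return document ids ranked by similarity of
    writing style to the query, nearest first.
    collection: list of (doc_id, feature_vector) pairs."""
    ranked = sorted(collection, key=lambda item: distance(query, item[1]))
    return [doc_id for doc_id, _ in ranked]

# Hypothetical two-dimensional style features, for illustration only.
reference = [("writer_A", [0.1, 0.9]), ("writer_B", [0.8, 0.2])]
collection = [("doc_1", [0.15, 0.85]), ("doc_2", [0.75, 0.25]), ("doc_3", [0.2, 0.8])]
query = [0.12, 0.88]

print(identify_writer(query, reference))      # a single writer label
print(retrieve_documents(query, collection))  # a ranked list of documents
```

Identification returns one label from the reference set, whereas retrieval returns a ranking over a document collection. In both cases, what is compared is the writing style encoded in the feature vectors.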
Recently, a unique challenge for writer recognition in
historical manuscripts has emerged. Researchers noticed a
writing technique, known as the staggering technique, where
different scribes write the same document to induce a one-
writer illusion [3]. The staggering technique might seriously
distort the performance of automatic writer recognition sys-