Pattern Recognition Letters 129 (2020) 137–143
An end-to-end deep learning system for medieval writer identification
N.D. Cilia∗, C. De Stefano, F. Fontanella, C. Marrocco, M. Molinara, A. Scotto Di Freca
Department of Electrical and Information Engineering, University of Cassino and Southern Lazio, Via Di Biasio 43, 03043 Cassino (FR), Italy
Article info
Article history:
Received 13 August 2019
Revised 13 November 2019
Accepted 18 November 2019
Available online 19 November 2019
MSC:
41A05
41A10
65D05
65D17
Keywords:
Deep learning
Transfer learning
Writer identification
Row detection
Avila bible
Digital paleography
Abstract
This paper presents an end-to-end system for identifying writers in medieval manuscripts. The proposed
system consists of a three-step model for the detection and classification of lines in the manuscript and
for page writer identification. The first two steps are based on deep neural networks trained with transfer
learning techniques and specialized to solve the task at hand. The third stage is a weighted majority vote
row-decision combiner that assigns a writer to each page. The main goal of this paper is to study the
applicability of deep learning in this context when only a relatively small training dataset is available. We
tested our system with several state-of-the-art deep architectures on a digitized manuscript known as
the Avila Bible, using only 9.6% of the total pages for training. Our approach proves to be very effective
in identifying page writers, reaching a peak accuracy of 96.48% and a peak F1 score of 96.56%.
© 2019 Elsevier B.V. All rights reserved.
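The third stage mentioned in the abstract, a weighted majority vote over row-level decisions, can be illustrated with a minimal sketch. It assumes that each detected row comes with a predicted writer label and a non-negative weight (for example, the row classifier's confidence); the function name `weighted_majority_vote`, the form of the weights, and the tie-breaking rule are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def weighted_majority_vote(row_votes: Iterable[Tuple[str, float]]) -> str:
    """Return the writer label with the largest total weight over a page's rows.

    `row_votes` holds one (writer_label, weight) pair per classified row.
    The meaning of the weight is an assumption here (e.g. row confidence).
    """
    totals = defaultdict(float)
    for writer, weight in row_votes:
        if weight < 0:
            raise ValueError("weights must be non-negative")
        totals[writer] += weight
    if not totals:
        raise ValueError("a page needs at least one classified row")
    # Assumed tie-break: among equal totals, pick the smallest label
    # so that the result is deterministic.
    return max(sorted(totals), key=totals.__getitem__)


# Hypothetical page with four rows: an unweighted vote would tie 2-2,
# while the weighted vote favours writer "A" (total 1.2 against 1.1).
page_rows = [("A", 0.9), ("B", 0.6), ("B", 0.5), ("A", 0.3)]
print(weighted_majority_vote(page_rows))  # prints A
```

With equal weights this reduces to a plain majority vote; the weights matter only when rows disagree and some row decisions are more reliable than others.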
1. Introduction
Paleography is the study of ancient and medieval handwriting.
An important problem faced by paleographers is identifying
the writers, a.k.a. scribes, who contributed to the drawing up of a
manuscript. Traditionally, paleographers perform qualitative evaluations
to distinguish the writers; in recent years, these techniques
have been joined by computer-based tools [1] that automatically
measure quantities such as the height and width of letters, distances
between characters, inclination angles, and the number and types of
abbreviations. Recently emerged approaches in digital paleography
combine powerful machine learning algorithms with high-quality
digital images of medieval manuscripts. However, traditional
techniques require a preliminary feature engineering step
that involves an expert in the field, thus increasing the application
development cost.
In recent years, deep-learning-based approaches have received
increasing attention from researchers thanks to their ability to
handle complex and difficult image classification tasks [2]. Deep
neural networks are capable of learning hierarchical feature representations
directly from data, instead of using handcrafted features
based on domain-specific knowledge [3]. Nonetheless, very few
studies have applied deep learning techniques to the interpretation of
medieval manuscripts, and previous approaches were mainly used
for identifying sundry elements of interest inside document pages,
without a specific focus on writer recognition.
Handled by Associate Editor: G. Sanniti di Baja, Ph.D.
∗ Corresponding author.
E-mail address: nicoledalia.cilia@unicas.it (N.D. Cilia).
https://doi.org/10.1016/j.patrec.2019.11.025
0167-8655/© 2019 Elsevier B.V. All rights reserved.
In our previous paper [4], we presented preliminary results of
a study in which deep neural networks were employed for the
identification of the scribes in ancient documents. To this end,
we proposed a deep transfer learning solution for row detection
and page classification, obtaining very encouraging results that enabled
us to extend the previous approach and develop an end-to-end
system for writer recognition. The proposed approach is
based on three steps intended (i) to detect the lines (a.k.a. rows)
in each page of the manuscript, (ii) to classify them, and (iii) to
recognize the writer of the entire page. The first step consists of
a deep-learning-based object detector trained via transfer learning
on a generic dataset (like MS-COCO [5]) and specialized to solve
the task at hand. The second step is a row classifier composed
of a fully convolutional feature extractor and a meta-architecture
classifier that can be trained either from scratch or by fine-tuning.
The third stage is a weighted majority vote row-decision combiner