TagPies: Comparative Visualization of Textual Data
Stefan Jänicke¹, Judith Blumenstein², Michaela Rücker², Dirk Zeckzer¹ and Gerik Scheuermann¹
¹ Image and Signal Processing Group, Leipzig University, Leipzig, Germany
² Faculty of History, Arts and Oriental Studies, Leipzig University, Leipzig, Germany
Keywords: Tag Clouds, Pie Charts, TagPies, Text Visualization, Text Comparison, Digital Humanities.
Abstract: A TagPie is a novel tag cloud layout that arranges the tags belonging to multiple data categories in a pie chart
manner. Motivated by research in classical philology, TagPies were designed to support the comparative
analysis of classical terminology. In this scenario, the data categories represent the co-occurrences of different
searched keywords, so that TagPies make it possible to compare the contexts in which these keywords were used.
This paper illustrates the iterative development of TagPies, which serve as a distant reading
view on a text corpus for humanities scholars. We outline various steps of our collaborative digital humanities
project, and we emphasize the utility of the proposed design by presenting various usage scenarios that represent
current research questions in classical philology.
1 MOTIVATION
Traditionally, humanities scholars read texts on paper in order to
generate and verify hypotheses about precisely formulated research
questions. As a result of mass digitization, scholars now have access
to large digital libraries containing numerous texts. This on-demand
availability of texts changes the traditional workflows of the scholars
in different ways. First, retrieving text passages becomes easier,
usually by querying a text corpus with a typical keyword-based search.
The drawback of this approach is that the quality of the results is
usually not satisfactory. Often, the humanities scholars receive too
many results, which they cannot process individually. Consequently, it
is impossible to generate useful hypotheses. On the other hand, the
precision can be low, so that many of the retrieved text passages are
irrelevant to the given research question. Especially in that case,
picking text passages related to the observed topic is a laborious
task. Second, the access to vast textual data brought forth new
research methodologies in the humanities, introduced by Franco Moretti
as distant reading (Moretti, 2005). Before the digital age, it was
inconceivable to generate hypotheses about texts without explicitly
reading them; Moretti presented research questions that were impossible
to investigate with the traditional close reading technique.
In our digital humanities project eXChange (http://exchange-projekt.de/),
the collaborating humanities scholars, six historians and classical
philologists, wanted to explore medical concepts in classical texts,
which required workflows that include distant as well as close
readings. Working with the project's large text corpus, the humanities
scholars are interested in the co-occurrences of medical terms. For
instance, they look for terms describing medical conditions, associated
terms for symptoms, body parts, etc., in order to explore what ancient
writers knew about the medical concept. The mission of the project was
to investigate novel research questions in classical philology, namely
the comparison of medical concepts. For instance, a humanities scholar
hypothesized that the terms morbus comitialis and morbus sacer were
both used to denote epilepsy (a discussion of this example can be found
in Section 5.2).
In order to support the comparative analysis of medical concepts in
classical texts, we developed TagPies in close collaboration with the
humanities scholars of our project. This paper outlines the steps of
the iterative development and the final TagPies layout algorithm, which
includes a tailored tag sorting mechanism and design features applied
to visually separate shared and individual contexts of terms. To meet
the needs of the humanities scholars, we embedded TagPies as a distant
reading visualization into a visual interface that is linked to a close
reading view, which enables the inspection of individual text passages
in order to assess their relevance to the observed medical concept.
Jänicke, S., Blumenstein, J., Rücker, M., Zeckzer, D. and Scheuermann, G.
TagPies: Comparative Visualization of Textual Data.
DOI: 10.5220/0006548000400051
In Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2018) - Volume 3: IVAPP, pages 40-51
ISBN: 978-989-758-289-9
Copyright © 2018 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved