TagPies: Comparative Visualization of Textual Data

Stefan Jänicke¹, Judith Blumenstein², Michaela Rücker², Dirk Zeckzer¹ and Gerik Scheuermann¹
¹ Image and Signal Processing Group, Leipzig University, Leipzig, Germany
² Faculty of History, Arts and Oriental Studies, Leipzig University, Leipzig, Germany

Keywords: Tag Clouds, Pie Charts, TagPies, Text Visualization, Text Comparison, Digital Humanities.

Abstract: A TagPie is a novel tag cloud layout that arranges the tags belonging to multiple data categories in a pie chart manner. Motivated by research in classical philology, TagPies were designed to support the comparative analysis of classical terminology. In this scenario, the data categories represent the co-occurrences of different searched keywords, so that TagPies make it possible to compare the contexts in which these keywords were used. This paper illustrates the iterative development of TagPies, which serve as a distant reading view on a text corpus for humanities scholars. We outline various steps of our collaborative digital humanities project, and we emphasize the utility of the proposed design by outlining various usage scenarios representing current research questions in classical philology.

1 MOTIVATION

Traditionally, humanities scholars read texts on paper in order to generate and verify hypotheses about precisely formulated research questions. As a result of mass digitization, scholars now have access to large digital libraries containing numerous texts. This on-demand availability of texts changes the traditional workflows of the scholars in different ways. First, retrieving text passages becomes easier, usually by querying a text corpus with a typical keyword-based search. The drawback of this approach is that the quality of the results is often unsatisfying. Frequently, the humanities scholars receive too many results to process individually.
Consequently, it is impossible to generate useful hypotheses. On the other hand, the precision can be low, so that many of the text passages found are irrelevant to the given research question. Especially in that case, picking text passages related to the observed topic is a laborious task. Second, the access to vast textual data brought forth new research methodologies in the humanities, introduced by Franco Moretti as distant reading (Moretti, 2005). Before the digital age, it was inconceivable to generate hypotheses about texts without explicitly reading them; Moretti presented research questions that were impossible to investigate with the traditional close reading technique.

In our digital humanities project eXChange (http://exchange-projekt.de/), the collaborating humanities scholars, six historians and classical philologists, wanted to explore medical concepts in classical texts, which required workflows that include distant as well as close readings. Working with the project's large text corpus, the humanities scholars are interested in the co-occurrences of medical terms. For instance, they look for terms describing medical conditions and for associated terms for symptoms, body parts, etc., in order to explore what ancient writers knew about a given medical concept. The mission of the project was to investigate novel research questions in classical philology, namely the comparison of medical concepts. For instance, a humanities scholar hypothesized that the terms morbus comitialis and morbus sacer were both used to denote epilepsy (a discussion of this example can be found in Section 5.2).

In order to support the comparative analysis of medical concepts in classical texts, we developed TagPies in close collaboration with the humanities scholars of our project.
This paper outlines the steps of the iterative development and the final TagPies layout algorithm, which includes a tailored tag sorting mechanism and design features applied to visually separate the shared and individual contexts of terms. To meet the needs of the humanities scholars, we embedded TagPies as a distant reading visualization into a visual interface that is linked to a close reading view, which enables the inspection of individual text passages in order to assess their relevance to the observed medical concept.

Jänicke, S., Blumenstein, J., Rücker, M., Zeckzer, D. and Scheuermann, G. TagPies: Comparative Visualization of Textual Data. DOI: 10.5220/0006548000400051. In Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2018) - Volume 3: IVAPP, pages 40-51. ISBN: 978-989-758-289-9. Copyright © 2018 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.