Adaptive Contour Classification of Comics Speech Balloons

Christophe Rigaud 1,2 (✉), Dimosthenis Karatzas 2, Jean-Christophe Burie 1, and Jean-Marc Ogier 1

1 Laboratory L3i, University of La Rochelle, Avenue Michel Crépeau, 17042 La Rochelle, France
{christophe.rigaud,jcburie,jmogier}@univ-lr.fr
2 Computer Vision Center, Universitat Autònoma de Barcelona, 08193 Bellaterra (Barcelona), Spain
dimos@cvc.uab.es

Abstract. Comic book digitization, combined with subsequent comic book understanding, gives rise to a variety of new applications, including content reflowing, mobile reading and multi-modal search. Document understanding in this domain is challenging because comics are semi-structured documents, with semantic information shared between the graphical and textual parts. Speech balloon contour analysis reveals the speech tone, which is an essential step towards fully automatic comics understanding. In this paper we present the first approach for classifying speech balloons in scanned comic books, in which we separate and analyze their contour variations to classify them as “smooth” (normal speech), “wavy” (thought) or “zigzag” (exclamation). The experiments show a global classification accuracy of 85.2% on a wide variety of balloons from the eBDtheque dataset.

Keywords: Image processing · Contour/shape separation · Contour classification

1 Introduction

Comic books are a widespread cultural expression and are commonly accepted as the “ninth art”. Comics are a hybrid medium, combining textual and visual information in order to convey their narrative. Digitization combined with subsequent document understanding of comic books is therefore of interest, both to add value to the existing paper-based comic heritage and to bridge the gap between paper and electronic comic media.
© Springer-Verlag Berlin Heidelberg 2014. B. Lamiroy and J.-M. Ogier (Eds.): GREC 2013, LNCS 8746, pp. 53–62, 2014. DOI: 10.1007/978-3-662-44854-0_5

In comics content understanding, speech balloons (or speech bubbles) are of particular interest because they link the textual content to the speaker. They provide two major pieces of information: the location of the speaker, indicated by the balloon tail, and the speech tone, conveyed by the patterns along the contour of the balloon. If we are able to automatically determine