ISSN 1392-124X (print), ISSN 2335-884X (online) INFORMATION TECHNOLOGY AND CONTROL, 2016, T. 45, Nr. 1

Text Line Segmentation with Parametric Water Flow Algorithm

Darko Brodić
University of Belgrade, Technical Faculty in Bor, Vojske Jugoslavije 12, 19210 Bor, Serbia, e-mail: dbrodic@tf.bor.ac.rs

Zoran N. Milivojević
College of Applied Technical Sciences, Aleksandra Medvedeva 20, 18000 Niš, Serbia

http://dx.doi.org/10.5755/j01.itc.45.1.11197

Abstract. The paper proposes an extension to the original water flow algorithm used for text line segmentation in a document image. This extension is called the parametric water flow algorithm. The original algorithm assumes that hypothetical water flows toward barriers that represent objects. Behind each barrier, the hypothetical water follows different pathways at a few specified angles, creating labeled wetted and unwetted regions. The hypothetical water flows from left to right and vice versa. Hence, the final labeled wetted and unwetted image regions are obtained as the unions of the regions produced by the two flow directions. The unwetted regions are used to segment text lines in a document image. The extension of the original water flow algorithm establishes the so-called water flow function, which is responsible for creating the unwetted regions. The proposed linear water flow function is then replaced by its parametric counterpart. The basic, linear and parametric water flow algorithms are tested and evaluated on different synthetic and handwritten text samples. The experiments show promising results in the area of text line segmentation.

Keywords: Document image analysis; Image processing; Optical character recognition; Text line segmentation; Water flow algorithms.

1. Introduction

Text line segmentation is a labeling process that assigns the same label to spatially aligned units in a document image [1]. Accordingly, the document image is split into different region groups.
One region group represents the text, which is segmented into different areas that represent the text lines. Text line segmentation of printed documents is considered a solved problem [2]. However, text line segmentation of handwritten documents remains an open research field [3-4]. It represents a challenge in document image understanding, analysis and processing [1, 3-4].

Many techniques for text line segmentation have been developed. They are usually classified into the following groups [1]: (i) projection profile methods, (ii) Hough transform methods, (iii) smearing methods, (iv) grouping methods, (v) methods for processing overlapping and touching components, (vi) stochastic methods, and (vii) other methods.

The projection profile methods have been used for printed documents. However, they can also be used for handwritten documents if they are adapted. These methods can exploit horizontal projection profiles, vertical projection profiles, or both. Calculating a projection profile means summing the pixel values along the horizontal or vertical axis. Because the original image is rotated over a specified range of angles, a projection profile is obtained for each angle of the image. Hence, it forms a function usually called a cost function. In this function, each local maximum represents a separate text line [5-6]. In contrast, a local minimum corresponds to the interline space. Still, these methods face obstacles in multi-skew text documents, because the maxima and minima are not easy to find. Furthermore, short text lines also create a problem, because they produce low peaks [1, 5].

The Hough transform methods [7] are used for finding straight lines in images [8]. The Hough transform converts the initial document image into the Hough domain. Potential alignments are hypothesized