Machine Vision and Applications (1997) 10: 9–16
© Springer-Verlag 1997

Adjacency matrix generation from the image of graphs: a morphological approach

Amit K. Das 1, Bhabatosh Chanda 2

1 Computer Science and Technology Department, Bengal Engineering College, Howrah 711 103, West Bengal, India
e-mail: amit@becs.ernet.in / Fax: +91-033-660 4564
2 Department of Electrical Engineering, University of Washington, Seattle, Washington, USA

Abstract. This paper presents a system for automatic generation of the adjacency matrix from the image of graphs. We assume the graph is printed or hand-printed and available as part of a document, either on its own or along with text and pictures. A morphology-based approach is used here to separate the components of the graph: vertices, edges and labels. A novel technique is proposed to traverse the non-planar edges joining the vertices. The proposed method may be used for logical compression of the information contained in the graph image in the form of an adjacency matrix. It may also be used to replace the cumbersome, error-prone and time-consuming manual method of generating the adjacency matrix for graphs with a large number of vertices and complex interconnections.

Key words: Document image analysis – Mathematical morphology – Compression of graph image – Adjacency matrix – Edge overrun

1 Introduction

Mathematical morphology is popular as a tool for dealing with the shapes of objects in an image. Morphological operations can simplify image data while preserving its essential shape characteristics and eliminating irrelevancies. Because the identification and decomposition of objects correlate directly with shape, it is natural that mathematical morphology has an essential role to play in machine vision (Serra 1982; Crimmins and Brown 1985; Haralick et al. 1987). Here we propose a method for automatic generation of the adjacency matrix of a graph from its image.
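As a concrete picture of the target output, the following minimal sketch builds an adjacency matrix from vertex labels and vertex pairs. It is an illustration only, not the method of this paper: the function name `adjacency_matrix`, the example labels and the assumption of an undirected graph with 0/1 entries are all hypothetical, and the vertex and edge lists are assumed to have been extracted from the image already.

```python
def adjacency_matrix(labels, edges):
    """Build a symmetric 0/1 adjacency matrix for an undirected graph.

    labels: vertex names in the order used for rows and columns.
    edges:  pairs of vertex names (assumed output of an extraction stage).
    """
    index = {label: i for i, label in enumerate(labels)}
    n = len(labels)
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        i, j = index[u], index[v]
        matrix[i][j] = 1  # edge u-v
        matrix[j][i] = 1  # undirected graph assumed, so mirror the entry
    return matrix


# Hypothetical four-vertex graph: a square A-B-D-C-A.
labels = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
matrix = adjacency_matrix(labels, edges)
for label, row in zip(labels, matrix):
    print(label, row)
```

For n vertices the matrix needs only n × n entries regardless of how the graph is drawn, which is the sense in which it is a compact logical description of the graph image.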
The graph, we assume, is a part of a document image and has three different components, namely vertices, edges and vertex names (labels). A document has primarily three different kinds of objects, i.e., text, pictures and graphics. The graph, in our case, is a part of the graphics present in the document image and may be separated out using an available segmentation method (Fan et al. 1994). The proposed method may be used to logically compress the information contained in a graph image, and it may solve the problem of time-consuming and error-prone manual adjacency matrix generation. The proposed method is a new application of automated document image analysis, and the algorithm presented here may be used in applications like recognition/interpretation of engineering drawings/graphics.

Correspondence to: A.K. Das

Existing publications report edge following in vectorization operations, where thinning is usually carried out to get single-pixel-width lines (Pavlidis 1986; Joseph 1989; Nagasamy and Langrana 1990). Thinning may result in discontinuity, and extra processing may be necessary to preserve the connectedness. Many thinning algorithms call for extensive noise cleaning (Arcelli and di Baja 1985) as a prerequisite, as noise severely degrades the efficiency and effectiveness of the process. After thinning, the images may be processed to extract connected chains of pixels as described by Freeman and Davis (1977). This is effective for compression, but for image analysis it is important to retain the complete line structure with all its branches and the topology of each junction. Therefore, further processing is necessary to preserve the branching and topological information. In Harris (1982) the skeleton is coded with pixel values 0, 1, 2 and 3 for background, end of line, intermediate points and junctions, respectively. Boatto et al. (1992) reported a method to locate the forked points. A primitive chain code is cited by Kasturi et al.
(1992) which preserves, besides connectivity, branching and topological information. A contour processing of edges, which avoids thinning, is described by Zhu-Jianxin et al. (1995), where elements of skeletal vectors and subshapes are described by a graph G(V, E) to analyse the forked joints in the line drawings. However, direct edge overrun is not usually required for the vectorization problem as it has a different objective (Kanungo et al. 1995).

The vectorization technique may also be used in adjacency matrix generation. However, that would incur excess computational cost due to thinning, extracting critical points and abstracting the thinned structures as straight-line segments and/or conic sections, etc. More importantly, since vectorization techniques do not resolve the edge overrun problem, postprocessing such as analysis of the sequence of abstracted structures is inevitable. As a result, from the compressed