The Small World of Software Reverse Engineering

Ahmed E. Hassan and Richard C. Holt
Software Architecture Group (SWAG)
School of Computer Science
University of Waterloo
Waterloo, Canada
{aeehassa,holt}@plg.uwaterloo.ca

ABSTRACT

Research in maintenance and reengineering has flourished and evolved into a central part of software engineering research worldwide. In this paper, we examine this research community through the publications of its members in several international conferences. We analyze our results using various graph and text mining techniques. We contrast our findings with those of other research communities.

1 INTRODUCTION

Publications in a research community give a picture of the progress of collaboration and the emergence of topics in an active research field. The authorship details on each publication represent a social network of collaboration between researchers in the community. One would expect a high degree of collaboration in an academic community, in contrast to a lower degree of collaboration in commercial communities. Furthermore, the titles of these publications permit us to track the appearance of new research topics and areas of interest in the community and the computer industry as a whole. Such topics of interest may in some cases explain changes in the collaboration structure of a community and may shed some light on its evolution.

DBLP [2] tracks the publication history for several conferences in the areas of reengineering, maintenance, and software engineering in general. The data is available as an XML file. It records, for each year, the titles of the publications and their authors. The availability of this data has encouraged us to study the structure of collaboration and the evolution of areas of interest in the reengineering community as part of the larger community of software engineering research.
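As a minimal sketch of how such an XML file could be turned into per-year publication records, the Python fragment below groups titles and author lists by year for one venue. It assumes records shaped like DBLP's `inproceedings` entries with `author`, `title`, `year`, and `booktitle` children; the sample data, the author names, and the function name `load_publications` are hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample in a DBLP-like shape (not real bibliographic data).
SAMPLE_XML = """
<dblp>
  <inproceedings key="conf/example/1">
    <author>Author A</author>
    <author>Author B</author>
    <title>Hypothetical Paper One.</title>
    <year>1999</year>
    <booktitle>WCRE</booktitle>
  </inproceedings>
  <inproceedings key="conf/example/2">
    <author>Author B</author>
    <author>Author C</author>
    <title>Hypothetical Paper Two.</title>
    <year>2000</year>
    <booktitle>WCRE</booktitle>
  </inproceedings>
  <inproceedings key="conf/example/3">
    <author>Author D</author>
    <title>Hypothetical Paper Three.</title>
    <year>2000</year>
    <booktitle>ICSM</booktitle>
  </inproceedings>
</dblp>
"""


def load_publications(xml_text, venue):
    """Return {year: [(title, [authors]), ...]} for the given venue."""
    root = ET.fromstring(xml_text.strip())
    by_year = defaultdict(list)
    for record in root.iter("inproceedings"):
        if record.findtext("booktitle") != venue:
            continue  # skip papers from other conferences
        year = int(record.findtext("year"))
        # itertext() keeps the title text even if it holds inline markup.
        title = "".join(record.find("title").itertext()).strip()
        authors = [a.text.strip() for a in record.findall("author")]
        by_year[year].append((title, authors))
    return dict(by_year)


if __name__ == "__main__":
    for year, papers in sorted(load_publications(SAMPLE_XML, "WCRE").items()):
        print(year, papers)
```

The author lists feed an analysis of collaboration, while the titles feed an analysis of topics. A full DBLP dump is large and relies on a DTD for character entities, so a real run would likely need a streaming parser rather than `ET.fromstring`.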
We examine the publications produced by researchers in the areas of software maintenance and reengineering in several international conferences. We develop a social collaboration network for the community using the co-authorship data for these conferences. In particular, we build a graph whose nodes are the authors who published in these conferences. An edge exists between two nodes if the corresponding authors co-authored a paper. Such a graph is shown in Figure 1. The figure was built using the co-authorship data for the Working Conference on Reverse Engineering (WCRE) from 1993 through 2002 inclusive. The size of each node in the graph is proportional to the number of publications by the node (author). In addition, weights were added to the edges to indicate the number of papers that two authors have written together. The layout of the figure was generated using a force-based algorithm [5]. Thus author nodes in the layout are closer to the other author nodes with which they interact the most. In the upper left corner of Figure 1, we show an overview of the full graph of co-authorship. The graph contains a single large connected component along with many smaller components that vary in size. In the main pane of the figure, we zoom to the center of the largest connected component and mark the authors' names for some of the large nodes in it.¹

Figure 1: Co-Authorship Graph for WCRE (1993-2002)
Legend: 1: E. Burd, 2: E. Stroulia, 3: M. Munro, 4: M. Harman, 5: E. Merlo, 6: G. Canfora, 7: K. Kontogiannis, 8: R. Koschke, 9: R. Holt

Figure 2 shows the variation of the size of the largest components in the WCRE co-authorship graph from 1993 to 2002. For 2002, the largest component contains around 29% of all authors who have ever published a paper in WCRE. The next largest component has always been considerably smaller; for 2002, it contains around 3% of all the authors. The years 1999 and 2000 saw large increases in the size of the largest component.
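The construction of the co-authorship graph and the measurement of its largest connected component can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the tooling used for the figures: the function names and the toy author lists are hypothetical, and the layout step (the force-based algorithm [5]) is omitted.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_coauthor_graph(papers):
    """papers: iterable of author-name lists, one list per paper.

    Returns (pub_count, edge_weight): papers per author (node size)
    and joint papers per author pair (edge weight).
    """
    pub_count = Counter()
    edge_weight = Counter()
    for authors in papers:
        unique = sorted(set(authors))  # sorted so each pair has one key
        pub_count.update(unique)
        for pair in combinations(unique, 2):
            edge_weight[pair] += 1
    return pub_count, edge_weight


def connected_components(nodes, edges):
    """Return the components as sets of nodes, largest first."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)


def largest_component_share(papers):
    """Fraction of all authors that fall in the largest component."""
    pub_count, edge_weight = build_coauthor_graph(papers)
    if not pub_count:
        return 0.0
    components = connected_components(pub_count, edge_weight)
    return len(components[0]) / len(pub_count)


if __name__ == "__main__":
    # Hypothetical toy data: five papers by six authors.
    toy = [["A", "B"], ["B", "C"], ["A", "B"], ["D", "E"], ["F"]]
    print(largest_component_share(toy))
```

Calling `largest_component_share` on the papers accumulated up to each year would give a per-year series of the kind plotted for the largest component, since the percentages are measured against all authors who have published up to that point.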
¹ A more interactive view of the figure is available online as an SVG or GDL at: http://plg.uwaterloo.ca/~aeehassa/home/pubs/wcreCoauthorsGraph.html. The graph has recently been chosen as the graph of the month and is accessible at http://www.aisee.com/graph of the month/wcre.html.

The large increases in 1999 and 2000 are due to the fact that a number of authors