International Journal of Computer Applications (0975 – 8887)
Volume 52 – No. 12, August 2012

Scientific Co-authorship Social Networks: A Case Study of Computer Science Scenario in India

Tasleem Arif
Department of IT, BGSB University, Rajouri, J&K, India

Rashid Ali
College of Computers and IT, Taif University, Taif, Saudi Arabia

M. Asger
School of Mathematical Sc. & Engg., BGSB University, Rajouri, J&K, India

ABSTRACT
Co-authorship is one of the most tangible and well-documented forms of research collaboration. Data mining techniques and social network analysis can be used to extract and study these collaborations. Social network analysis provides insight into the connections between groups of individuals. It is these connections that channel the flow of information and the sharing of knowledge. To understand the flow of information and interpret collaboration, co-authorship can be used as a measure for studying intra- and inter-organization collaborations. In this paper, we analyze the collaboration scenario in Computer Science in India, and assess how researchers in a few of the best Indian Institutes of Technology (IITs) collaborate and relate to each other. We construct and visualize scientific co-authorship social network graphs of these institutions. We also compare and contrast network metrics for these institutes and experimentally deduce that these networks, like other social networks, exhibit “small world” properties.

General Terms
Data Mining, Web Mining, Social Network Analysis

Keywords
Co-authorship Networks, Visualization, IIT

1. INTRODUCTION
A social network is a structured representation of social actors (nodes) and their interconnections (ties). Such a network can be represented as a set of points (or vertices) denoting people, joined in pairs by lines (or edges) denoting acquaintance. Social networks form social groups that share common interests.
These groups are steadily emerging on the Web, and the demand for forming on-demand social networks is immense. Community members profit from being linked to other people who share their interests despite having geographically dispersed affiliations. One could, in principle, construct the social network for a company or firm, for a school or university, or for any other community up to and including the entire world. Extraction and visualization of social relations can benefit many applications in areas like crime and terrorism prevention, organizational network analysis, customer interactions, connections and communities. Understanding the graph structure of these networks can likewise benefit applications in many diverse fields. Adjacent users in a social network tend to trust each other and mostly have common interests. Users normally find the content that interests them in their neighboring regions of the network. It would therefore be useful to have efficient algorithms to infer the actual degree of shared interest between two users, and the trust or reliability a user enjoys among other users in the network.

Sharing of knowledge and interaction between researchers is well known to be the essence of research practice and collaboration. Collaboration is defined as “working jointly with others or together especially in an intellectual endeavor” [1]. Researchers interact not only to communicate research activities but also to collaborate with each other to co-produce research and co-author research results. Since collaboration has the potential to promote research activity, productivity and impact, it should be encouraged, supported and monitored. Although it has been argued that co-authorship is no more than a partial indicator of collaboration, studies indicate that it is the highest measure of collaboration [2]. Several studies, for instance [3], have shown that there is a positive correlation between collaboration and co-authorship.
In fact, co-authorship is one of the most tangible and documented forms of research collaboration [4]. These collaborations or connections form a social network, and in order to understand their effect, they need to be viewed from a network perspective. Social network analysis (SNA) focuses on the relationships among social entities, and on the patterns and implications of these relationships, and allows us to examine those patterns in a structural manner [5]. SNA can be used to discover underlying social structure such as: central nodes that act as hubs, leaders or gatekeepers; highly connected groups; and patterns of interactions between groups [5]. SNA has been used to study social interaction in a wide range of domains. Examples include collaboration networks [6], directors of companies [7], inter-organizational relations [8], and many others.

SNA examines relationships between social entities, e.g. people, groups, teams, tasks, beliefs and knowledge. These entities are modeled as nodes, and their relationships are modeled as links between the nodes. Not all nodes in the network are connected, and some nodes may have multiple connections. This mathematical model is applicable in many content areas such as communications, information flow, and group and organizational affiliations [5]. SNA looks at groups of people and their interactions, and provides a methodology that explains much of the complex behavior of these social groups. SNA thus relies heavily on graph theory to model network structure.

Although there are other forms of academic collaboration, this paper defines collaboration as jointly co-authoring a paper, and shows the use and evaluation of our approach for identifying scientific co-authorship relationships. We use publication data to extract social networks of researchers.
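The node-and-link model described above can be illustrated with a minimal sketch, assuming that each publication is available as a plain list of author names. The function names (`build_coauthorship_graph`, `degree`, `clustering_coefficient`) and the sample author names are hypothetical and used only for illustration; this is not the extraction procedure used in this paper. The sketch links every pair of co-authors of a paper and computes two simple per-node quantities: the number of distinct co-authors, and the local clustering coefficient, a metric commonly examined when assessing small-world behavior.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthorship_graph(papers):
    """Return an adjacency map: author -> {co-author: number of joint papers}.

    `papers` is an iterable of author-name lists, one list per publication.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for authors in papers:
        unique = sorted(set(authors))  # ignore duplicate names within one paper
        for author in unique:
            graph[author]  # touch the key so single-author papers yield isolated nodes
        for a, b in combinations(unique, 2):
            graph[a][b] += 1  # undirected tie, weighted by joint paper count
            graph[b][a] += 1
    return {author: dict(nbrs) for author, nbrs in graph.items()}


def degree(graph, author):
    """Number of distinct co-authors of `author`."""
    return len(graph[author])


def clustering_coefficient(graph, author):
    """Fraction of pairs of `author`'s co-authors who have also co-authored."""
    nbrs = list(graph[author])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in graph[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical sample data: each inner list is the author list of one paper.
papers = [
    ["A. Rao", "B. Sen", "C. Iyer"],
    ["A. Rao", "B. Sen"],
    ["C. Iyer", "D. Das"],
    ["E. Nair"],
]
graph = build_coauthorship_graph(papers)
```

In this toy data, "C. Iyer" has three distinct co-authors, only one pair of whom ("A. Rao" and "B. Sen") have written together, so `clustering_coefficient(graph, "C. Iyer")` is 1/3. Edge weights record repeated collaboration, so `graph["A. Rao"]["B. Sen"]` is 2.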
From the publication data, it is possible to learn various attributes of a researcher, such as research interests, collaborations, and even conferences attended recently. The publication data can be retrieved from various sources such as journals, electronic databases, conference websites and proceedings, and the homepages of researchers and organizations. Nowadays, it is common for research institutions and researchers to maintain a record of their publications and make it available on their respective websites. In this paper, we discuss the extraction of a collaboration network to study co-authorship collaborations in Computer