Structure and Dynamics of Enterprise 2.0 Communities

Alex Nevidomsky
IBM Ireland
IBM Software Lab, Bld. 6, Mulhuddart, Dublin 15, Ireland
353-1-815-1908
alex_nevidomsky@ie.ibm.com

Alexander Troussov
IBM Ireland
IBM Software Lab, Bld. 6, Mulhuddart, Dublin 15, Ireland
353-1-815-1906
atrousso@ie.ibm.com

ABSTRACT
In this paper, we present an empirical analysis of user communities in a large-scale Enterprise 2.0 system. We studied communities formed at IBM in the Communities application of the IBM Lotus Connections suite. We found that community sizes exhibit idiosyncratic behavior for communities with fewer than ten members, while following a power-law distribution for larger communities. Our results do not show a significant correlation between community size and the number of artifacts related to user activity, such as discussion topics. On a monthly scale, more established communities generally do not accumulate more results of user activity than newer communities.

Keywords
Enterprise 2.0, Social software, Community of practice, Data mining, Power law

1. INTRODUCTION
We present an empirical analysis of a large-scale Enterprise 2.0 system, three years after its introduction within a large corporate environment. The study was carried out to support the development of more realistic generative models for testing the performance of social software applications.

The system (the Communities application within the IBM Lotus Connections suite) supports the creation of formal communities of users, which can be public, have moderated membership, or be private with membership by invitation. Within communities there are discussion forums and collections of feeds and bookmarks.

Statistics of the communities were analyzed from the moment of the installation of the system to the moment when the number of communities exceeded 9500 and the biggest community had more than 23000 members. Figure 1 shows the growth in usage of the system over the period.

Figure 1.
Number of newly created communities aggregated in 30-day intervals.

In the enterprise environment the users are professionals, their identities are visible to other users, and community formation might be affected by organizational structures and business processes. We therefore assume that the structure and dynamics of communities in Enterprise 2.0 might be quite different from those of other social web systems.

One of the target applications of the results of our analysis is the development of generative models for the simulation of Enterprise 2.0 communities, to test the performance of social software. Another application is the tailoring of community-based recommender systems to Enterprise 2.0 specifics.

In the next section we provide basic statistics on the number and size of the communities. In section 3 we provide statistics of user activity over time and theorize about possible explanations of the observed phenomena. Finally, section 5 contains the conclusions and future directions.

2. STATISTICS OF THE COMMUNITIES
We observed the creation of communities of three types: public (users are free to join), restricted access (publicly visible, but membership must be approved), and private (not visible to non-members, with members joining by invitation). The access policy is chosen at community creation time. No guidelines were provided to help users choose the policy, so the community initiators used their own judgment. While it is technically possible for the community creator to change the access policy at any time, we believe this option is rarely exercised. The system does not allow access from outside the company.

At the time of the study we found 3644 public, 2171 restricted access, and 3730 private communities, which is 38%, 23% and 39% respectively. As restricted access implies public visibility of the contents, publicly visible communities outnumber private communities by about 1.6:1 (5815 versus 3730).
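The shares and the visible-to-private ratio follow directly from the three counts reported above. A minimal sketch in Python, for readers who want to recompute them; the variable and function names are illustrative assumptions and not part of the studied system:

```python
# Community counts by access policy, as reported in the text.
counts = {"public": 3644, "restricted": 2171, "private": 3730}


def shares(counts):
    """Return each category's share of the total, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


def visible_to_private_ratio(counts):
    """Ratio of publicly visible (public + restricted) to private communities."""
    return (counts["public"] + counts["restricted"]) / counts["private"]


pct = shares(counts)
ratio = visible_to_private_ratio(counts)

for name, value in pct.items():
    print(f"{name}: {value:.0f}%")
print(f"visible : private = {ratio:.2f} : 1")
```

Treating restricted-access communities as publicly visible follows the definition given in the text; grouping them with private communities instead would change the ratio.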
The distribution of community sizes is highly skewed: most communities are small, while rare but large outliers (for instance, the largest community has 23725 members) significantly affect the average, making it a misleading characteristic. Like most of the other empirical distributions encountered in our study, this distribution exhibits power-law behavior in the upper tail. In our case the average size of a community is 73, while the median is only 6; 90% of communities have fewer than 65 members, and 75% have fewer than 20 members.

The observed data demonstrate the characteristic power-law tail for communities with more than ten members, while smaller communities exhibit idiosyncratic behavior, as shown in Figure 2. The spike at the beginning of the curve may have a separate explanation. We observed around 1900 communities with a single member (the first bar in Figure 2); more than 1200 of them were private or restricted access, which seems impractical. While