Scale invariant bipartite graph generative model

Szymon Chojnacki
Institute of Computer Science, Polish Academy of Sciences
sch@ipipan.waw.pl

Mieczysław Kłopotek
Institute of Computer Science, Polish Academy of Sciences
klopotek@ipipan.waw.pl

ABSTRACT
The purpose of this article is to present a preferential attachment model adapted to the generation of bipartite graphs. The original model produces unipartite graphs whose node degree distribution follows a power law. The motivation for extending the classic model is that multi-partite graph topologies are becoming increasingly common in social networks. A good example of such a structure is the tripartite hypergraph of users, resources and tags called a folksonomy. This raises the question of whether models that describe unipartite graphs can be transferred to these new settings. We present both empirical results on node degree distributions in real-life bipartite networks and a modified preferential attachment model that reproduces the properties observed in these networks.

Categories and Subject Descriptors
H.2.8 [Database Management]: Database Applications - Data Mining

General Terms
Measurement, Theory

Keywords
graph generators, bipartite graphs, power law distribution, scale invariance

1. INTRODUCTION
Bipartite or affiliation networks describe a situation in which there are two types of nodes and direct links exist only between nodes of different types. One type of node can be interpreted as actors and the other as events in which the actors take part [20]. From such a network we can induce two unipartite graphs, each containing nodes of only one type. However, it has been shown that, despite some similarities, the structural properties of the two representations may in general differ significantly [17].

More formally, a graph is an ordered pair $G = (V, E)$ comprising a set of vertices (or nodes) $V$ and a set of edges (or links) $E$.
A bipartite network (or bigraph) is a graph whose vertices can be labeled with two types, $A$ and $B$. It differs from a classic unipartite graph in that $V$ consists of two disjoint sets, $V = V_A \cup V_B$ with $V_A \cap V_B = \emptyset$, and edges exist only between nodes of different types, $E \subseteq V_A \times V_B$. The degree of a node is the number of its direct neighbors. The probability that the degree $k_v$ of a node $v$ of type $A$ is exactly $z$ can be calculated as:

$$P(k_v = z) = \frac{\left|\{v \in V_A : |\{w : (v, w) \in E\}| = z\}\right|}{|V_A|}$$

In this article we focus on the degree distributions obtained from our graph generative model. Empirical results show that the density function of the degree distribution in real-life datasets often follows a power law. It means that the frequency of an event decreases at a higher rate than its size increases. Such a property is obtained from a polynomial relation:

$$f(x) = a x^{k} \qquad (1)$$

where $a$ and $k$ are constants and $k$ is called the scaling exponent. The tail of a power law distribution vanishes more slowly than exponentially, which is described as a heavy tail. The shape of the density function of a scale invariant (or scale free) distribution does not change when we multiply the variable $x$ by a constant $c$. One can verify that $f(cx) = a (cx)^{k} = c^{k} f(x) \propto f(x)$. This property becomes visible when we take the logarithm of both sides of Eq. 1. In this way we obtain a linear relationship, $\log(f(x)) = k \log(x) + \log a$, in which $k$ is the slope of the line drawn by the points. A scale invariant distribution cannot be obtained from purely random graphs [6]. The first model able to reproduce this property through an atomic process was the preferential attachment model [2]. We modify this model and show that our extension is also able to create power law degree distributions in bipartite networks.

The rest of the paper is organized as follows. Section 2 surveys the related work. Section 3 gives the results of our exploratory analysis.
In Section 4 we describe in detail the preferential attachment model for generating bipartite random graphs; we conduct a formal mathematical analysis and present the results of experimental simulations. We conclude and discuss the implications of our findings in Section 5.
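The degree probability $P(k_v = z)$ and the log-log form of Eq. 1 introduced in this section can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `degree_distribution` and `loglog_slope`, the edge-list representation and the toy data are all assumptions made for illustration. Two caveats apply. Nodes without edges do not appear in an edge list, so the sketch normalizes by the number of nodes with at least one edge rather than by $|V_A|$. The ordinary least-squares fit on logarithms is only a simple way to read off the slope $k$, not necessarily the estimation procedure used in the paper.

```python
import math
from collections import Counter


def degree_distribution(edges, side=0):
    """Return {z: P(k_v = z)} for the nodes on one side of a bipartite edge list.

    edges: iterable of (a, b) pairs with a in V_A and b in V_B (assumed format).
    side:  0 to count degrees of type-A nodes, 1 for type-B nodes.
    Assumption: only nodes that occur in at least one edge are counted.
    """
    unique_edges = set(edges)                           # ignore duplicate edges
    degree = Counter(edge[side] for edge in unique_edges)
    counts = Counter(degree.values())                   # degree z -> number of nodes
    n_nodes = len(degree)
    return {z: c / n_nodes for z, c in sorted(counts.items())}


def loglog_slope(xs, ys):
    """Least-squares slope and intercept of log(y) against log(x).

    For points lying on f(x) = a * x**k the slope is k and the
    intercept is log(a), as in the log form of Eq. 1.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy bipartite graph (hypothetical data): a1 has three neighbors, the rest one.
edges = [("a1", "b1"), ("a1", "b2"), ("a1", "b3"),
         ("a2", "b1"), ("a3", "b1"), ("a4", "b2")]
dist_a = degree_distribution(edges, side=0)

# Points taken exactly from f(x) = 5 * x**(-2); the fit should recover k and a.
xs = [1, 2, 4, 8, 16]
ys = [5.0 * x ** -2.0 for x in xs]
slope, intercept = loglog_slope(xs, ys)

print(dist_a)
print(slope, math.exp(intercept))
```

In the toy graph three of the four type-A nodes have degree 1 and one has degree 3. The synthetic points use a negative exponent because a decaying density corresponds to $k < 0$ in Eq. 1.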