Describing the orthology signal in a PPI network at a functional, complex level

Pavol Jancura 1, Eleftheria Mavridou 2, Beatriz Pontes 3, and Elena Marchiori 1

1 Institute for Computing and Information Sciences, Radboud University Nijmegen, Postbus 9010, 6500 GL Nijmegen, The Netherlands {jancura,elenam}@cs.ru.nl
2 Department of Medical Microbiology, Radboud University Medical Center, Postbus 9101, 6500 HB Nijmegen, The Netherlands
3 Department of Computer Science, University of Seville, Avda. Reina Mercedes s/n 41012 Seville, Spain

Abstract. In recent work, a stable evolutionary signal induced by orthologous proteins has been observed in a Yeast protein-protein interaction (PPI) network. This finding suggests that more highly connected subgraphs of a PPI network are potential mediators of evolutionary information. Because protein complexes are also likely to be present in such subgraphs, it is interesting to characterize how the orthology signal biases the detection of putative protein complexes. To this aim, we propose a novel methodology for quantifying the functionality of the orthology signal in a PPI network at the protein complex level. The methodology performs a differential analysis between the functions of complexes detected by clustering a PPI network using only proteins with orthologs in another given species, and the functions of complexes detected using the entire network or sub-networks generated by random sampling of proteins. We applied the proposed methodology to a Yeast PPI network using orthology information from a number of different organisms. The results indicate that the proposed method is capable of isolating functional categories that can be clearly attributed to the presence of an evolutionary (orthology) signal, and of quantifying their distribution at a fine-grained protein level.
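As an illustration of the network construction step only, the following minimal sketch builds the sub-network induced by proteins with orthologs and a set of randomly sampled control sub-networks. It is not the authors' implementation: the function names, the toy edge list, the protein identifiers, and the choice to draw random samples of the same size as the ortholog set are all assumptions made for this example. The clustering and functional annotation steps are omitted.

```python
import random


def induced_subgraph(edges, keep):
    """Return the edges whose two endpoints both belong to `keep`."""
    keep = set(keep)
    return [(u, v) for u, v in edges if u in keep and v in keep]


def random_control_subgraphs(edges, sample_size, n_samples, seed=0):
    """Build induced subgraphs on randomly sampled protein sets.

    Assumption: each control uses as many proteins as the ortholog set.
    """
    rng = random.Random(seed)
    proteins = sorted({p for edge in edges for p in edge})
    return [
        induced_subgraph(edges, rng.sample(proteins, sample_size))
        for _ in range(n_samples)
    ]


# Toy PPI network with hypothetical protein identifiers.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
# Hypothetical set of proteins with an ortholog in another species.
orthologs = {"A", "B", "C"}

ortho_net = induced_subgraph(edges, orthologs)
controls = random_control_subgraphs(edges, len(orthologs), n_samples=3)
```

Each of the resulting networks (the full network, `ortho_net`, and every element of `controls`) would then be clustered separately, so that the functions of the predicted complexes can be compared across them.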
1 Introduction

In general, two proteins are orthologous if they originated from a common ancestor and have been separated in evolutionary time only by a speciation event. Orthologous proteins typically have high amino acid sequence similarity and usually retain the same or a very similar function, which allows biological information to be transferred from one protein to the other. Orthology is therefore central to the study of evolution, and the problem of establishing reliable orthology relations has been widely investigated in comparative genomics (see for instance [1]). Many databases and public resources of orthologs have been made available, such as Inparanoid [2] and OrthoMCL-DB [3].

Recent studies have used this form of evolutionary information to analyse protein modules and PPI networks, for instance [4–12]. In particular, in a study by Wuchty et al. [6] a stable evolutionary signal was found to be present in a Yeast PPI network, as examined through its pairwise orthologs with respect to various different species. They observed that high local clustering around protein-protein interactions correlates with evolutionary conservation of the participating proteins. This means that highly connected proteins, and protein pairs embedded in a well clustered neighbourhood, tend to be evolutionarily conserved and therefore retain their evolutionary signal. These findings also suggest that more highly connected areas of a PPI network are potential mediators of evolutionary information.

Because more highly connected regions of PPI networks contain protein modules or complexes, in this paper we focus on the explicit use of orthology to see whether there are functional complexes that can be clearly attributed to this evolutionary signal. To this aim, we try to characterize those functions of complexes predicted by clustering the subgraph of a PPI network induced by all proteins with orthologs in another given species, but not predicted (or predicted for a smaller