International Journal of Computer Applications (0975-8887)
Volume 83, No. 13, December 2013

Investigation of Peer Grouping Methods in Peer-to-Peer Computing Networks

Jigyasu Dubey
Department of Information Technology, Shri Vaishnav Institute of Technology & Science, Indore, India

Vrinda Tokekar, Ph.D
Institute of Engineering & Technology, Devi Ahilya Vishwavidyalaya, Indore, India

ABSTRACT
Peer-to-Peer (P2P) technology supports building virtual computing systems over the Internet that are dedicated to large-scale computation problems. In such systems, participating peers are classified into different groups to achieve higher scalability and decentralization. Each peer group is responsible for carrying out a certain function of the system. The selection of peers for these groups, i.e. the grouping criterion, is one of the issues that can be used to improve the performance of P2P computing systems. In this paper we investigate the different grouping strategies possible in P2P computing networks. To compare them, parameters such as reliability, scalability, and execution time are taken into account. This study shows that if the participating peers in a peer group are spread over different geographic locations, the system becomes more reliable.

General Terms
Peer-to-Peer Networks

Keywords
Peers, Groups, P2P, Grouping Mechanism, P2P computing

1. INTRODUCTION
In recent years, Peer-to-Peer (P2P) technology has received increasing attention from the research community as well as from industry. Decentralization is one of the main concepts that make P2P networks attractive. A P2P system is a distributed system consisting of a set of cooperating computers, known as peers, which share resources such as CPU cycles and memory with other peers in the system without any central authority. It is a virtual network built on top of a physical network.
Initially, P2P networks were used only for file-sharing applications such as Napster [1] and BitTorrent [2]. Nowadays P2P networks are also used for developing large-scale distributed computing applications. SETI@home [3][4] and distributed.net [5] are well-known examples of such systems. P2P computing is a form of distributed computing that utilizes the idle CPU cycles of PCs connected to the Internet. P2P computing systems are dynamic in nature, i.e. a peer may join or leave the system at any time by its own decision.

According to their degree of decentralization, P2P systems are classified into two categories: hybrid systems and purely decentralized systems [6]. In a hybrid P2P system there is a central peer that maintains information about all the member peers. This peer is also known as a directory server or super peer. SETI@home and distributed.net fall under this category. In a pure P2P system there is no central point of control, and all peers have equal functionality. Pure P2P systems scale well because they avoid the need for centralized operations or servers. JNGI [7] is one of the P2P computing systems based on a pure P2P architecture. JNGI divides peers into groups according to their functionality. Dividing peers into several peer groups limits the amount of communication between peers and hence improves scalability.

Several issues need to be addressed when building generalized pure P2P computing systems. One of them is the organization of computational resources into groups [7]. The group is a widely used structure in distributed computing. Grouping refers to the partition of a fixed number of peers into multiple P2P communities. In pure P2P computing systems, grouping decisions are an important operational issue and need to be considered at design time. In this study we investigate the various grouping mechanisms used in P2P computing networks.
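As a minimal illustration of grouping as a partition, the following Python sketch splits a fixed set of peers into groups under an arbitrary criterion. It is not taken from JNGI or any other system discussed here: the `Peer` record, its fields, and the functions `group_peers` and `intra_group_pairs` are hypothetical names introduced only for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, Iterable, List


@dataclass(frozen=True)
class Peer:
    """Hypothetical peer record; the fields are illustrative assumptions."""
    peer_id: str
    role: str    # functional role of the peer, e.g. "worker"
    region: str  # coarse geographic label, e.g. "asia"


def group_peers(
    peers: Iterable[Peer],
    criterion: Callable[[Peer], Hashable],
) -> Dict[Hashable, List[Peer]]:
    """Partition peers: each peer lands in exactly one group, keyed by criterion."""
    groups: Dict[Hashable, List[Peer]] = defaultdict(list)
    for peer in peers:
        groups[criterion(peer)].append(peer)
    return dict(groups)


def intra_group_pairs(groups: Dict[Hashable, List[Peer]]) -> int:
    """Count peer pairs that share a group, a rough proxy for in-group traffic."""
    return sum(len(m) * (len(m) - 1) // 2 for m in groups.values())


if __name__ == "__main__":
    peers = [
        Peer("p1", "worker", "asia"),
        Peer("p2", "worker", "europe"),
        Peer("p3", "monitor", "asia"),
        Peer("p4", "dispatcher", "europe"),
    ]
    by_role = group_peers(peers, lambda p: p.role)
    by_region = group_peers(peers, lambda p: p.region)
    print({k: [p.peer_id for p in v] for k, v in by_role.items()})
    print({k: [p.peer_id for p in v] for k, v in by_region.items()})
    print(intra_group_pairs(by_role), intra_group_pairs(by_region))
```

Passing the criterion as a function keeps the partitioning step independent of the grouping strategy, so the same peer set can be grouped by function, by a performance characteristic, or by location and the resulting groups compared.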
We also study the effect of the grouping mechanism on system performance in terms of parameters such as reliability, scalability, and execution time.

2. RELATED WORK
Jerome Verbeke, Neelakanth Nadgir et al. [7] presented JNGI, a decentralized P2P computing framework for large-scale computation problems. In this framework the computational resources are divided into groups according to their functionality. They proposed three peer groups: the monitor group, the worker group, and the task dispatcher group. The design of the framework limits communication to small peer groups, which enables the framework to scale to a very large number of peers.

Jerome Verbeke et al. [8] proposed building a new type of group, called similarity groups, into the JNGI system. They define a similarity group as a peer group in which all the peers have common characteristics such as CPU speed or memory size. These groups can be used for either qualitative (structural) or quantitative (performance) purposes. Their results show that the use of quantitative similarity groups increases the performance of a computation, while a qualitative criterion increases the homogeneity of the computation but not its performance. However, peer grouping based on a geographic location criterion needs to be considered to improve reliability.

Virginia Lo et al. [9] proposed a system named Cluster Computing on the Fly (CCOF), which harvests CPU cycles from ordinary users (desktop PCs). They also proposed a wave scheduler that exploits the large blocks of idle time at night to provide a higher quality of service for deadline-driven jobs, using a geography-based overlay to organize hosts by time zone. With this wave scheduler they explore the possibility of capturing CPU cycles from the many machines that lie completely idle at night. It provides a stronger guarantee of continuously available cycles and is therefore useful for deadline-driven tasks.
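The time-zone idea behind such a scheduler can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the CCOF implementation: the `Host` record, the whole-hour `utc_offset` field, the function names, and the 22:00 to 06:00 night window are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Host:
    """Hypothetical host record used only for this sketch."""
    host_id: str
    utc_offset: int  # assumed whole hours east of UTC (negative = west)


def local_hour(utc_hour: int, utc_offset: int) -> int:
    """Local hour of day (0-23) for a given UTC hour and offset."""
    return (utc_hour + utc_offset) % 24


def is_night(hour: int, start: int = 22, end: int = 6) -> bool:
    """True if hour lies in [start, end), a window assumed to wrap past midnight."""
    return hour >= start or hour < end


def hosts_by_zone(hosts: Iterable[Host]) -> Dict[int, List[Host]]:
    """Organize hosts into groups keyed by their time-zone offset."""
    zones: Dict[int, List[Host]] = {}
    for host in hosts:
        zones.setdefault(host.utc_offset, []).append(host)
    return zones


def night_hosts(hosts: Iterable[Host], utc_hour: int) -> List[Host]:
    """Hosts whose zone is currently in the assumed night window."""
    zones = hosts_by_zone(hosts)
    selected: List[Host] = []
    for offset in sorted(zones):
        if is_night(local_hour(utc_hour, offset)):
            selected.extend(zones[offset])
    return selected


if __name__ == "__main__":
    pool = [Host("a", 0), Host("b", 5), Host("c", -8), Host("d", 9)]
    for hour in (12, 20):
        print(hour, [h.host_id for h in night_hosts(pool, hour)])
```

As the UTC hour advances, the selected set shifts from one group of zones to the next, which is the sense in which work can follow the night around the globe.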
CCOF provides higher computational performance, but because it uses peers from the same night-time zone, which belong to the same geographic location, the reliability of the system decreases. Bendikt Elser et al. [10] define a concise set of requirements for general, application-independent group management models in distributed systems. On the basis of