Communication Model for Decentralized Meta-Scheduler in Grid Environments

Florin Pop
Faculty of Automatics and Computer Science, University Politehnica of Bucharest, Romania
Email: florinpop@cs.pub.ro

Abstract: This paper presents a communication model for a decentralized meta-scheduler in Grid environments. The proposed model is distributed, fault-tolerant, adaptive and efficient. It is designed as an agent platform for Grid scheduling algorithms, which gives the model its decentralized architecture. The platform contains two types of agents: one for resource management (Broker), and one that manages users' task requests (Agents). The paper describes the communication protocol between agents and the proposed agent structure. The scheduling algorithm is described as a logical flow of activities. The scheduler relies on cluster schedulers such as Condor or PBS, which is what makes it a meta-scheduler. The agent platform and the scheduling algorithm are tested in a cluster setting. The results show very good communication times when serving requests from multiple users.

Keywords: Grid Scheduling, Decentralized Meta Scheduler, Fault Tolerant Systems, Communication Protocol.

I. INTRODUCTION

Grid computing is becoming increasingly attractive as a resource sharing system. Managing these resources according to the requirements of users from different virtual organizations is an important goal for Grid systems. Scheduling in distributed systems has evolved significantly with the growing popularity of Grid systems [1] and virtual organizations [2]. Scheduling algorithms for large scale distributed systems such as the Grid are the subject of recent research in the domain. Depending on their applicability domain, there are three types of scheduling systems: cluster-level, inter-cluster and hierarchical schedulers.
The role of cluster-level schedulers is to determine the optimal resources for the execution of a job submitted within the cluster. An advantage of this type of scheduler is that it resides on a single cluster node and can access information about the other nodes in the cluster. Examples include Condor [4], PBS [5] and LSF.

Inter-cluster (decentralized) schedulers have components distributed among different clusters. The components cooperate to determine the cluster to which the application should be assigned in order to satisfy the criteria established in the scheduling process. Schedulers of this kind are also called meta-schedulers.

Hierarchical architectures mix centralized and decentralized components. The components cooperate to determine the optimal resources or cluster for the execution of a job submitted within the Grid.

Since Globus 4.0 [3], web-services based architectures are, together with Grid services, significant factors in resource and application management, in quality of service control and in the control of resource accessibility, which is one of the most important aspects. The design of scheduling algorithms for a heterogeneous computing system interconnected by an arbitrary communication network (such as a Grid) is one of the current concerns in distributed systems research. The main purpose of these algorithms is to generate a schedule from different sets of input tasks, taking into account the potentially non-uniform computation and communication costs that appear in heterogeneous systems [6].

This paper presents an implementation of a communication model for Grid scheduling as a middleware Grid infrastructure that features a decentralized scheduler. The scheduling algorithm uses genetic and adaptive algorithms for cost estimation and resource selection.
The resulting scheduling system allows task evaluation and submission using a network of agents [8]. The experimental results are obtained from simulated experiments. For real environments, existing systems are used: MonALISA [7] for cluster monitoring, and PBS [5] and Condor [4] for job execution.

The paper is structured as follows. Section 2 presents related work in the field. Section 3 describes the proposed scheduling model: the Agent and Broker structure, the communication model and protocol, and the proposed scheduling algorithm. The experimental results obtained with this system are presented in Section 4. Section 5 gives the conclusions and outlines directions for future work.

II. RELATED WORK

Jennifer M. Schopf describes in [9] the three main phases of Grid scheduling: resource discovery, which generates a list of potential resources; information gathering about those resources and selection of a best set; and job execution, which includes file staging and cleanup. These phases, and the steps that make them up, are:

Resource Discovery. The actions performed in this phase are: authorization filtering, application requirement definition, and filtering to meet the minimal job requirements.

International Conference on Complex, Intelligent and Software Intensive Systems 0-7695-3109-1/08 $25.00 © 2008 IEEE DOI 10.1109/CISIS.2008.131 315
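The three Resource Discovery actions can be pictured as successive filters over a pool of candidate resources. The following Python sketch is illustrative only and is not taken from [9] or from the system described in this paper: the `Resource` and `JobRequirements` types, their fields, and the `discover_resources` function are hypothetical names, and CPU count and memory are assumed to be the only job requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical description of one Grid resource."""
    name: str
    cpus: int
    memory_gb: int
    authorized_users: frozenset


@dataclass(frozen=True)
class JobRequirements:
    """Hypothetical minimal requirements stated for a job."""
    min_cpus: int
    min_memory_gb: int


def discover_resources(user, requirements, resources):
    """Return the resources `user` may access that satisfy `requirements`."""
    # Step 1: authorization filtering. Drop resources the user cannot access.
    authorized = [r for r in resources if user in r.authorized_users]
    # Step 2: application requirement definition. In this sketch the caller
    # has already expressed the requirements as a JobRequirements value.
    # Step 3: filtering to meet the minimal job requirements.
    return [
        r for r in authorized
        if r.cpus >= requirements.min_cpus
        and r.memory_gb >= requirements.min_memory_gb
    ]
```

Under these assumptions, the function returns the list of potential resources that the later scheduling phases would evaluate further; an empty list means no resource passed both filters.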