Performance models for hierarchical grid architectures

Paolo Cremonesi, Roberto Turrin
Dip. di Elettronica e Informazione - Politecnico di Milano
via Ponzio, 34/5, I-20133 Milano, Italy

Abstract— The main characteristics of large–scale, geographically distributed grid systems are resource heterogeneity and network latency. In this paper, we use queuing network models to analyze data–parallel grid applications and we show the effects of resource heterogeneity, communication delays, network bandwidths and synchronization overheads on application-level performance. The proposed models rely on the statistical pattern of computation, communication, and I/O operations in the parallel applications, as well as on measurable infrastructure characteristics. Finally, we show that the high variability in execution and communication times must be considered when modeling applications on geographically distributed grid infrastructures.¹

I. INTRODUCTION

This paper presents queuing network models for the performance analysis of grid–based applications. The grid infrastructure under consideration has a multi–level hierarchical architecture. Cluster grids and desktop grids form the lowest level and can be composed into higher-level campus grids and global grids. Each level in the hierarchy is composed of a number of meta–nodes, and each meta–node is, in turn, composed of a number of lower-level meta–nodes. Grid processing nodes can be heterogeneous and are assumed to be mono–programmed. Performance analysis is limited to Single Program Multiple Data (SPMD) applications. Communications can be synchronous or asynchronous. In the case of synchronous communications, the synchronization overhead across the grid and the network overheads and latencies are considered. The models require knowledge of some hardware characteristics of the target grid environment, which can be obtained by running benchmark programs available in the literature [1].
Furthermore, knowledge of a few characteristics of the application being executed is needed, such as the presence of synchronous or asynchronous communications and the amount of data transferred among processes.

Extending the models presented in [2], we consider a grid application to be composed of a number of statistically identical CPU bursts, internal communication bursts (i.e., communication within a meta–node) and external communication bursts (i.e., communication among meta–nodes) that are executed in a cyclic fashion. The regular and cyclic structure of a grid application, confirmed by several studies [3], [4], [5], usually derives from the presence in the application of outer iteration loops (e.g., the time evolution of some physical model, or the convergence of some approximate algorithm). For such applications it is possible to define a fork-join queuing network model that describes the performance of the coupled application/architecture system.

The paper is organized as follows. Section II describes other work related to the performance modeling of grid applications. Section III defines the class of grid systems to which our analysis can be applied. Section IV integrates architecture and application characteristics into a queuing network model. Section V analyzes the effects of node heterogeneity and service time distributions on the synchronization overhead among grid nodes. Section VII uses the models to predict the synchronization overhead as a function of node heterogeneity. Finally, Section VIII summarizes the results and the plans for future research.

¹ This work was supported by the Italian Ministry of Education, Universities and Research (MIUR) in the framework of the FIRB-Perf project.

II. RELATED WORKS

Task graph models are frequently used to evaluate the performance of grid applications when the program control structures can be represented by means of “series–parallel” graphs.
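As an illustration of how a series–parallel task graph is evaluated, the following minimal sketch computes the completion time of a graph with fixed task times: a series composition adds the times of its children, while a parallel (fork-join) composition takes the maximum. The class and function names (`Task`, `Series`, `Parallel`, `completion_time`) and the numeric values are hypothetical and are not taken from any of the cited works; real task graph models typically use random rather than fixed task times.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Task:
    """A leaf task with a fixed (deterministic) execution time."""
    time: float


@dataclass
class Series:
    """Children run one after another."""
    children: List["Node"]


@dataclass
class Parallel:
    """Children start together (fork); the block ends when all finish (join)."""
    children: List["Node"]


Node = Union[Task, Series, Parallel]


def completion_time(node: Node) -> float:
    """Completion time of a series-parallel graph with fixed task times."""
    if isinstance(node, Task):
        return node.time
    times = [completion_time(child) for child in node.children]
    if isinstance(node, Series):
        return sum(times)           # series: times add up
    return max(times, default=0.0)  # parallel: wait for the slowest child


# Hypothetical iteration: three heterogeneous nodes compute in parallel,
# then a single communication step follows the join.
iteration = Series([
    Parallel([Task(2.0), Task(3.0), Task(5.0)]),
    Task(1.0),
])
print(completion_time(iteration))  # 6.0
```

In this example the join waits for the slowest node (5.0 time units), so the two faster nodes sit idle for part of the iteration; that idle time is the kind of synchronization overhead that grows with node heterogeneity.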
Lee and Weissman [6] present a performance model for the analysis of grid–enabled network services, i.e., high-performance applications available on-line. The model deals with stateless data-parallel services and provides a heuristic for adaptive scheduling of service requests. Giersch et al. [7], [8] evaluate the performance of grid applications when scheduling tasks sharing files on master-slave architectures. The analysis assumes independent tasks whose execution depends upon a number of partially shared files.

For more realistic models, queuing networks and Petri nets can be used. Many approaches describe multi–programmed and multi–tasked parallel systems executing a sequence of programs of similar task structure. The queuing network models are usually based on fork–join, open queuing networks, where a stream of series–parallel applications arrives in the system. Balsamo et al. [9] describe some approximate models for the analysis of heterogeneous parallel systems. Qin et al. [10] model the performance of cluster–based grid architectures where each cluster node is a shared–based multiprocessor. Bacigalupo et al. [11] describe a very simple queuing network model for the scalability analysis of e–commerce grid-based applications. Sun and Wu [12] and Gong et al. [13] describe a performance prediction and task scheduling system based on a queuing network model. The model can