Future Generation Computer Systems 25 (2009) 499–510
Journal homepage: www.elsevier.com/locate/fgcs

Using historical accounting information to predict the resource usage of grid jobs

Rosario M. Piro a,*,1, Andrea Guarise b, Giuseppe Patania b, Albert Werbrouck b

a Molecular Biotechnology Center (MBC), Department of Genetics, Biology and Biochemistry, University of Torino, Via Nizza 52, 10126 Torino, Italy
b Istituto Nazionale di Fisica Nucleare (INFN) - Sezione di Torino, Via Pietro Giuria 1, 10125 Torino, Italy

Article history: Received 23 October 2007; received in revised form 6 November 2008; accepted 8 November 2008; available online 25 November 2008

Keywords: Resource usage prediction; Grid accounting; Workload analysis

Abstract

Basing job scheduling decisions on estimated queue wait times may help in efficiently balancing the workload on the grid. Previous work on usage prediction has mainly described methods for estimating queue wait times on clusters and supercomputers, based on predicting the run times of the individual jobs in a queue. We evaluate the possibility of using the historical information provided by a grid accounting system to predict not only the run times of grid jobs, but also other types of resource usage (or resource consumption), thereby enlarging the parameter space on which job scheduling decisions may be based. For this purpose we analyze three grid accounting datasets from a large-scale production environment and report interesting findings about their characteristics.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

Although recent years have witnessed major advances in research on computational grids and grid middleware, the definition of appropriate job scheduling strategies for efficiently balancing the workload among the available resources, and thus optimizing the system's overall throughput, is still an open issue.
The prediction of job run times, or job execution times (also called "wall clock times" to distinguish them from CPU times), may be a basis for estimating job queue wait times [1–5] and may help in improving the performance of local schedulers [3,5]. Moreover, estimates of the queue wait times on different systems may also be used in distributed computing environments, such as a computational grid, to guide the submission of single jobs for remote execution in order to help balance the incoming workload, or for the purpose of co-allocating multiple resources.

In our previous simulation work, we demonstrated that an economic approach to job scheduling in computational grids may aid in balancing the workload among the Computing Elements (CEs), if resource prices reflect the queue wait times on the CEs and the Resource Brokers (RBs), or metaschedulers, use a price-sensitive scheduling strategy [6,7]. We assumed, however, that job run times and queue wait times were precisely known in advance, which is not the case in production environments. This and similar strategies for workload balancing may benefit significantly from accurate resource usage prediction.

* Corresponding author. E-mail address: rosario.piro@unito.it (R.M. Piro).
1 Previously at INFN Torino, now at MBC.

In contrast to most of the related work on resource usage prediction, which has focused on the prediction of job run times based on historical information from local workload traces, in this paper we evaluate the prediction of several types of resource usage/consumption (namely CPU time, wall clock time, physical memory and virtual memory) and base our predictions on potentially grid-wide historical information provided by a grid accounting system (in our case the Distributed Grid Accounting System, DGAS [8,9]). Predicting not only job run times, but also other usage values, may be useful, for example, to predict a job's cost in an economic grid market that considers not only processing power as a commodity.
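To make the idea of cost prediction from several usage values concrete, the following is a minimal sketch and not the predictor evaluated in this paper. It assumes hypothetical accounting records (the field names are illustrative and do not reflect the DGAS schema), uses the mean over a user's past jobs as a placeholder prediction rule, and assumes made-up per-unit prices.

```python
from statistics import mean

# Hypothetical accounting records; field names are illustrative, not the DGAS schema.
HISTORY = [
    {"user": "alice", "cpu_time": 100.0, "wall_time": 120.0, "pmem": 512.0, "vmem": 1024.0},
    {"user": "alice", "cpu_time": 140.0, "wall_time": 180.0, "pmem": 768.0, "vmem": 1536.0},
    {"user": "bob", "cpu_time": 10.0, "wall_time": 15.0, "pmem": 64.0, "vmem": 128.0},
]

# The four usage types considered in the text (seconds and MB are assumed units).
USAGE_TYPES = ("cpu_time", "wall_time", "pmem", "vmem")


def predict_usage(history, user):
    """Predict each usage value as the mean over the user's past jobs.

    The mean is a placeholder rule chosen for illustration only.
    """
    past = [rec for rec in history if rec["user"] == user]
    if not past:
        return None  # no history for this user, so no prediction
    return {key: mean(rec[key] for rec in past) for key in USAGE_TYPES}


def predict_cost(predicted, prices):
    """Price-weighted sum of the predicted usage values (unpriced types cost nothing)."""
    return sum(prices.get(key, 0.0) * value for key, value in predicted.items())


# Assumed prices: only CPU seconds and MB of physical memory are charged.
PRICES = {"cpu_time": 0.01, "pmem": 0.001}

estimate = predict_usage(HISTORY, "alice")
cost = predict_cost(estimate, PRICES)
```

In this sketch a market that charges for memory as well as CPU time needs a prediction for each charged usage type, which is why a run time estimate alone would not suffice to estimate the cost.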
Shneidman et al., for example, have emphasized the importance of predicting resource consumption for a successful application of market-based approaches to solving resource allocation problems in distributed systems [10]. Furthermore, being able to accurately predict multiple usage values may serve as a basis for improved job scheduling strategies for Resource Brokers, since the efficiency of scheduling decisions depends not only on the quality, but also on the quantity of relevant information available about jobs and resources.

The prediction of the grid jobs' physical and virtual memory usage, for example, may be useful in cases where multiple concurrent jobs on a host share the same memory. Scheduling grid jobs to CEs that better fit their memory requirements may aid in improving the overall resource utilization, since the maximization of resource utilization should concern not only processing power (as is often assumed), but also other resource types. "Idle" memory, for example, may be considered as much a waste of valuable resources as idle CPUs.

0167-739X/$ – see front matter. doi:10.1016/j.future.2008.11.003

The remainder of the paper is organized as follows: Section 2 introduces the prediction based on historical information and