CSRD (2009) 24: 211–224
DOI 10.1007/s00450-009-0077-5

SPECIAL ISSUE PAPER

Revenue maximization in web service provision

Michele Mazzucco · Isi Mitrani · Jennie Palmer · Mike Fisher · Paul McKee

Received: 21 July 2008 / Accepted: 6 February 2009 / Published online: 6 May 2009
Springer-Verlag 2009

M. Mazzucco · I. Mitrani · J. Palmer
School of Computing Science, Newcastle University, Newcastle NE1 7RU, UK
e-mail: Michele.Mazzucco@ncl.ac.uk
e-mail: Isi.Mitrani@ncl.ac.uk

M. Fisher · P. McKee
BT Group, Adastral Park, Ipswich IP5 3RE, UK

Abstract  An architecture of a hosting system is presented, where a number of servers are used to provide different types of web services to paying customers. There are charges for running jobs and penalties for failing to meet agreed Quality-of-Service requirements. The objective is to maximize the total average revenue per unit time. Dynamic policies for making allocation and admission decisions are introduced and evaluated. The results of several experiments with a real implementation of the architecture are described.

Keywords  Web Service Hosting · Resource Allocation · Admission Control · Queuing Theory

CR subject classification  D.2.8 · D.4.7 · D.4.8 · G.3 · G.4

1 Introduction

This work deals with the topic of web service hosting in a commercial environment. A service provider employs a cluster of servers in order to offer a number of different services to a community of users. The users pay for having their jobs run, but demand in turn a certain quality of service. Thus, with each service type is associated a Service Level Agreement (SLA), formalizing the obligations of the users and the provider. We consider two business models. In the first, a user agrees to pay a specified amount for each accepted and completed job, while the provider agrees to pay a penalty whenever the response time (or waiting time) of a job exceeds a certain bound.

In the second model, a SLA covers a group of jobs, referred to as a "stream".
The user agrees to pay a specified amount for each accepted stream, and also to submit the jobs in it at a specified rate. The provider promises to run all jobs in the stream, and also to pay a penalty whenever the average performance for the stream falls below a certain limit. In both cases, it is the provider's responsibility to decide how to allocate the available resources, and when to accept jobs, or streams, in order to make the system as profitable as possible. Efficient dynamic policies that avoid both over-provisioning and overloading are clearly desirable. That, in general terms, is the problem that we wish to address.

Our approach has two strands. The first involves quantitative modelling. Under suitable assumptions about the nature of user demand, it is possible to evaluate explicitly the effect of particular server allocation and admission policies. Hence, we derive numerical algorithms that can be invoked at decision-making instants in order to implement both the allocation and the admission policies in a near-optimal manner. It is perhaps worth mentioning that the cost function adopted here could be generalized. For example, the SLA may include penalties for rejecting jobs, in addition to those associated with waiting or response time. Some such generalizations would be easy to implement, but are not included here.

The second strand consists of designing and implementing a middleware platform for the deployment and use of web services. That system, called SPIRE (Service Provisioning Infrastructure for Revenue Enhancement), is a self-configurable and self-optimizing middleware platform [9]
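To see why admission control matters in the first business model, the expected revenue per unit time can be computed under simple queueing assumptions. The sketch below assumes, purely for illustration, a single FIFO server with Poisson arrivals and exponential service times (an M/M/1 queue, a simplification of the multi-server setting studied in this paper); in that case the response time W is exponentially distributed with parameter (mu − lam), so the probability of an SLA violation is exp(−(mu − lam) · bound). All parameter names and values are illustrative, not taken from the paper:

```python
import math

def revenue_rate(lam, mu, charge, penalty, bound):
    """Average revenue per unit time for the per-job SLA model,
    under illustrative M/M/1 (FIFO) assumptions.

    lam     -- arrival rate of accepted jobs (must satisfy lam < mu)
    mu      -- service rate of the server
    charge  -- amount paid per accepted and completed job
    penalty -- amount paid back when a response time exceeds `bound`
    bound   -- response-time obligation in the SLA
    """
    if lam >= mu:
        raise ValueError("queue is unstable: require lam < mu")
    # In an M/M/1 queue the response time W is exponential with
    # parameter (mu - lam), so P(W > bound) = exp(-(mu - lam) * bound).
    p_violation = math.exp(-(mu - lam) * bound)
    # Each job earns `charge` and incurs `penalty` with probability
    # p_violation; accepted jobs flow through at rate lam.
    return lam * (charge - penalty * p_violation)

if __name__ == "__main__":
    # Revenue peaks at an intermediate load: beyond some arrival rate,
    # each extra accepted job costs more in expected penalties than it earns.
    for lam in (0.2, 0.5, 0.8, 0.95):
        r = revenue_rate(lam, mu=1.0, charge=1.0, penalty=2.0, bound=2.0)
        print(f"lam={lam:.2f}  revenue rate={r:.4f}")
```

The non-monotone shape of this curve is what makes the admission problem non-trivial: accepting every arriving job can drive the average revenue negative, which motivates the dynamic allocation and admission policies introduced below.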