This can lower the direct comparability of data, at least when the data are viewed without explanatory background information.

This study seeks to understand whether comparing the performance of urban bus operators through a benchmarking exercise is both useful and justifiable. Benchmarking could be deemed useful if the performance comparisons exhibit sufficient significant variability in performance between operators such that operators can learn lessons from one another. The exercise could be viewed as justifiable if different external conditions do not affect performance to the extent that the variability in performance is mainly a function of these conditions, rather than true differences in performance that are within the control of management.

PUBLIC TRANSPORT BENCHMARKING RESEARCH

Benchmarking as a tool for performance comparison and best-practice finding has been described and defined in many articles and papers. Fong et al. state that benchmarking draws wide attention from various disciplines and that definitions of benchmarking differ according to the process or practice being benchmarked and the methodology itself (1). Fong et al. therefore provided three working definitions for benchmarking. The working definition most closely related to the IBBG process is that of Lema and Price: “A systematic and continuous measurement process; a process of continuously measuring and comparing an organisation’s business process against business leaders anywhere else in the world to gain information which will help the organization to take action to improve its performance” (2). This paper neither provides an overview of benchmarking definitions nor aims to add to this list.
However, an adapted working definition is given to better represent the IBBG benchmarking process, as follows: Benchmarking is a systematic process of continuously measuring, comparing, and understanding organizations’ performance and change in performance of a diversity of key business processes against comparable peers anywhere else in the world to gain information that will help the participating organizations to take action to improve their performance.

Benchmarking is applicable to many sectors, including the public transport sector. An overview of public transport benchmarking initiatives has been provided in a variety of reports and papers (3–5). Other papers have described lessons learned from specific public transport benchmarking initiatives. Lessons from Railbench are described by Vaglio (6). Gudmundsson et al. describe lessons from Benchmarking European Sustainable Transport and Benchmarking of Benchmarking (7). Lessons learned from the CoMET and Nova metro benchmarking groups are described by Anderson (8, 9). However, the literature review revealed few examples of benchmarking activity within the public bus service industry. Mulley describes the

Variability in Comparable Performance of Urban Bus Operations

Mark Trompet, Richard J. Anderson, and Daniel J. Graham

Whether comparing the performance of urban bus operators through a benchmarking exercise is useful and justifiable is examined. Benchmarking can be deemed useful if performance comparisons exhibit sufficient significant variability in performance between operators such that lessons can be learned from one another. The exercise can be viewed as justifiable if different external conditions do not affect performance to the extent that the variability of the results can be judged as incomparable.
The data used for the study were collected by the International Bus Benchmarking Group, facilitated by Imperial College London, and related to 10 medium to large bus operators from nine countries for 2001 to 2007. After data stratification and normalization, especially for differences in vehicle size, demand profile, and commercial speed, the results suggest that comparing performance of urban bus operations through benchmarking is both useful and justifiable as long as there is a sufficient number of operators in the comparison that exhibit similar operating characteristics and urban environments.

In 2003, two large urban bus operators formed a group to compare performance and share best practices with peers in other large cities. For one of these operators, there were no sufficiently comparable bus systems within its country, and comparison had to be found at an international level. Other interested organizations were approached, and in August 2004 the International Bus Benchmarking Group (IBBG) was founded. As IBBG approaches its fifth annual phase, it is a good time to reflect on the benchmarking work of the group and to communicate to a wider audience whether quantitative benchmarking has been a useful and valid tool.

In many countries, operators in large cities often have no domestic peers with which to compare their performance. Benchmarking has the advantage that similar operators, comparable in size and operating characteristics, can nonetheless be found elsewhere in the world. Furthermore, lessons or otherwise interesting practices can be shared with organizations from other cultures and backgrounds. This can open the minds of managers to ideas and practices that otherwise would not have been apparent to them. However, these different cultures and backgrounds can lead to exogenous factors affecting the performance of operators beyond the control of management.
Even in countries where peers can be found domestically, differences exist in local conditions that can affect performance.

Railway and Transport Strategy Centre, Centre for Transport Studies, Department of Civil and Environmental Engineering, Imperial College London, Skempton Building, SW7 2AZ London, United Kingdom. Corresponding author: M. Trompet, m.trompet@imperial.ac.uk.

Transportation Research Record: Journal of the Transportation Research Board, No. 2111, Transportation Research Board of the National Academies, Washington, D.C., 2009, pp. 177–184. DOI: 10.3141/2111-20