CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE
Concurrency Computat.: Pract. Exper. 2013; 25:1524–1546
Published online 8 July 2012 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/cpe.2891

A performance evaluation of distributed database architectures

Shiping Chen 1,*, Alex Ng 2 and Paul Greenfield 3

1 CSIRO ICT Centre, Australia
2 University of Ballarat, Australia
3 CSIRO CMIS, Australia

*Correspondence to: Shiping Chen, PO Box 76, Epping, NSW 1710, Australia. E-mail: shiping.chen@csiro.au

SUMMARY

The globally integrated contemporary business environment poses new challenges for database architectures: organizations must improve the performance, scalability, reliability and data privacy of their database applications as business needs evolve. Although a number of distributed database architectures are available, an in-depth comparative understanding of their performance characteristics is lacking. In this paper, we report a performance study of three typical database architectures: centralized, partitioned and replicated. We used TPC-C as the evaluation benchmark to simulate a contemporary business environment, together with a commercially available database management system that supports all three architectures. We compared the performance of the partitioned and replicated architectures against the centralized database, which yielded some interesting observations and practical experience. The findings and practice presented in this paper provide useful information for enterprise architects and database administrators in determining the appropriate database architecture when moving from centralized to distributed environments. Copyright © 2012 John Wiley & Sons, Ltd.

Received 1 May 2011; Revised 26 December 2011; Accepted 17 March 2012

KEY WORDS: distributed database architecture; database partition; database replication; database benchmarking

1. INTRODUCTION

Codd [1] proposed the foundation for the relational database management system (RDBMS) in 1970, highlighting the use of a relational view of data as a means of describing data with its natural structure, which formed a solid basis for treating the derivability, redundancy and consistency of relations. Since then, many commercial and open-source relational database systems have been developed, such as Oracle, DB2 and MySQL. Today, the RDBMS has become an essential component of most contemporary enterprise systems.

With the advancement of the Web and the global expansion of business, enterprise systems face the challenges of supporting ever-increasing numbers of clients and business partners, global data sharing and messaging, and 24×7 availability. From the viewpoint of database systems, these challenges imply that back-end database systems must: (a) scale to support increasing loads; (b) support different distributed configurations to facilitate an enterprise's global operations; and (c) be robust and reliable in the face of various technical failures and natural disasters.

To meet these challenges of performance, scalability and resiliency in Internet-age business environments, commercial and open-source relational database systems have incorporated a variety of new features and configurable distributed database architectures. For example, IBM's DB2 pureScale provides a cluster solution for non-mainframe platforms of up to 128 database servers, with a fault-tolerant architecture and shared-disk storage to achieve automatic