A performance evaluation of distributed database architectures
Shiping Chen1,*,†, Alex Ng2 and Paul Greenfield3
1CSIRO ICT Centre, Australia
2University of Ballarat, Australia
3CSIRO CMIS, Australia
SUMMARY
The globally integrated contemporary business environment poses new challenges to database architectures:
organizations need to improve the performance, scalability, reliability and data privacy of their database
applications as business evolves. Although a number of distributed database architectures are available,
there is a lack of in-depth, comparative understanding of their performance characteristics. In this paper,
we report a performance study of three typical database architectures: centralized, partitioned and
replicated. We used TPC-C as the evaluation benchmark to simulate a contemporary business environment,
together with a commercially available database management system that supports all three architectures.
We compared the performance of the partitioned and replicated architectures against the centralized
database, which yielded some interesting observations and practical experience. The findings and practice
presented in this paper provide useful information for enterprise architects and database administrators
in determining the appropriate database architecture when moving from centralized to distributed
environments. Copyright © 2012 John Wiley & Sons, Ltd.
Received 1 May 2011; Revised 26 December 2011; Accepted 17 March 2012
KEY WORDS: distributed database architecture; database partition; database replication; database benchmarking
1. INTRODUCTION
Codd [1] proposed the foundation for the relational database management system (RDBMS) in 1970,
highlighting the use of a relational view of data to describe data with its natural structure,
which formed a solid basis for treating derivability, redundancy and consistency of
relations. Since then, many commercial and open-source relational database systems have been
developed, such as Oracle, DB2 and MySQL. Today, the RDBMS has become an essential component
of most contemporary enterprise systems.
With the advancement of the Web and the global extension of business, enterprise systems face the
challenges of supporting ever-increasing numbers of clients and business partners, global data sharing
and messaging, and 24×7 availability. From the viewpoint of database systems, these challenges imply
that the back-end database systems (a) must scale to support increasing loads; (b) must support different
distributed configurations to facilitate an enterprise’s global operations; and (c) must be robust and
reliable in the face of various technical failures and natural disasters.
To meet these challenges of performance, scalability and resiliency in Internet-age business environments,
commercial and open-source relational database systems have incorporated a variety of new features and
configurable distributed database architectures. For example, IBM’s DB2 pureScale provides a cluster
solution for non-mainframe platforms of up to 128 database servers, with a fault-tolerant architecture
and shared-disk storage to achieve automatic
*Correspondence to: Shiping Chen, PO Box 76, Epping, NSW 1710, Australia.
†E-mail: shiping.chen@csiro.au
CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE
Concurrency Computat.: Pract. Exper. 2013; 25:1524–1546
Published online 8 July 2012 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/cpe.2891