A Byzantine Fault Tolerance Model for a Multi-Cloud Computing

Mohammed A. AlZain, Ben Soh and Eric Pardede
Department of Computer Science and Computer Engineering, La Trobe University, Bundoora 3086, Australia.
Email: [maalzain@students., b.soh@, e.pardede@]latrobe.edu.au

Abstract—Data security has become an important requirement for clients when dealing with clouds that may fail due to faults in the software or hardware, or attacks from malicious insiders. Hence, building a highly dependable and reliable cloud system has become a critical research problem. This paper presents BFT-MCDB (Byzantine Fault Tolerance Multi-Clouds Database), a practical model for building a system with Byzantine fault tolerance in a multi-cloud environment. The model relies on a novel approach that combines Byzantine Agreement protocols and Shamir's secret sharing approach to detect Byzantine failure in a multi-cloud computing environment as well as to ensure the security of the stored data within the cloud. Using qualitative analysis, we show that adopting the Byzantine Agreement protocols in the proposed BFT-MCDB model increases system reliability and enables gains in regard to the three security dimensions (data integrity, data confidentiality, and service availability). We also carry out experiments to determine the overheads of using the Agreement protocols.

Keywords—Multi-Cloud Computing, Byzantine fault tolerance, Data Security, Data Replication, Byzantine Agreement protocols.

I. INTRODUCTION

The idea of multi-clouds is different from that of a federated cloud, as multi-clouds are centrally controlled by an administrative domain which controls other clouds in the same domain [9],[7]. In previous research [4],[6], we proposed the MCDB model, which ensures the security and privacy of data in a multi-cloud computing environment.
We subsequently enhanced the security of our model by improving the service dependability of MCDB with triple modular redundancy (TMR) techniques [15]. The benefit of this improvement is that MCDB [5] is able to catch an active (non-Byzantine) fault at the time of execution. But how does one catch a latent, or Byzantine, fault in multi-clouds?

Byzantine fault tolerance (BFT) has received growing attention from the academic research community, but not many systems use it in practice. While a great deal of recent research has focused on comparing the standard practical Byzantine fault tolerance protocol (PBFT) [11] and improving its performance with the development of Zyzzyva [16] and Aardvark [12], very few studies of BFT in a multi-cloud computing environment have addressed the detection of Byzantine failure so as to ensure the security of stored data within the cloud. In fact, the original definition of Byzantine faults [19] does not include the security dimensions (data integrity, data confidentiality and service availability) [3], whereas a Byzantine cloud may cooperate with malicious insiders to increase data intrusion.

Data security is a major issue whenever users rely on third-party services, because of the possibility of Byzantine failure in the cloud. Data security is an important requirement for clients when dealing with clouds that may fail due to faults in the software or hardware, or attacks from malicious insiders. Therefore, building a highly dependable and reliable cloud system has become a critical research problem. To address these issues, this work improves the existing MCDB model [5] in order to build a Byzantine fault tolerant multi-cloud database model which can detect Byzantine faults before they are activated, that is, before they cause any negative impact on the system. Byzantine faults are difficult to detect because a latent fault produces no output.
The previous MCDB model was able to detect non-latent (non-Byzantine) faults, whereas the current BFT-MCDB is able to detect latent (Byzantine) faults. We based our model on the state machine replication approach [21]. Viewing our model in terms of state machines helps in understanding how our model replicates the data in the multi-clouds and centrally controls these clouds through the cloud manager. The state machine approach is a general method for implementing a fault-tolerant system: in the state machine environment, a distributed system tolerates faults by replicating the servers that may fail [21], which is similar to the procedures of our BFT-MCDB model. BFT-MCDB can guarantee the robustness of the system by building a group of one cloud manager connected to 2f+1 clouds, tolerating up to f faulty replicas at run-time.

The contribution of this work is as follows. This work identifies the Byzantine fault tolerance problem in the multi-cloud computing environment and proposes a Byzantine fault tolerance model, named BFT-MCDB, to ensure the robustness of the multi-cloud environment. The model presented in this paper relies on a novel approach that combines the Byzantine Agreement protocols [19],[16] and Shamir's secret sharing approach [15] to detect Byzantine failure in the multi-cloud computing environment as well as to ensure the security of stored data within the cloud. BFT-MCDB is based on a state machine approach which has a replication mechanism [21]. Viewing our model in terms of state machines helps in understanding how our model

M.A. AlZain is sponsored by Taif University in the Kingdom of Saudi Arabia.

2013 IEEE 16th International Conference on Computational Science and Engineering 978-0-7695-5096-1/13 $31.00 © 2013 IEEE DOI 10.1109/CSE.2013.30 130
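The following is an added illustration, not code from the paper, of the "one cloud manager connected to 2f+1 clouds, up to f faulty" setting combined with Shamir's secret sharing: the manager splits a record into 2f+1 shares, one per cloud, and, when the clouds reply, either rebuilds the record, reports it unavailable, or flags a Byzantine (corrupted) share. It assumes a reconstruction threshold of f+1, arithmetic in a fixed prime field, and that silent and corrupting clouds together number at most f; none of these parameters are specified on this page.

```python
import random
import secrets

P = 2**61 - 1  # prime modulus of the share field (an assumption, not from the paper)

def split_secret(secret, f, seed=None):
    """Shares {cloud_id: y} for clouds 1..2f+1; any f+1 of them rebuild secret.
    `seed` is only for reproducible tests; production use keeps the CSPRNG."""
    rnd = random.Random(seed).randrange if seed is not None else secrets.randbelow
    coeffs = [secret % P] + [rnd(P) for _ in range(f)]  # degree-f polynomial
    return {x: sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
            for x in range(1, 2 * f + 2)}

def interpolate(points, at):
    """Lagrange interpolation over GF(P) of the polynomial through `points`."""
    total = 0
    for xi, yi in points:
        num = den = 1
        for xj, _ in points:
            if xj != xi:
                num = num * (at - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def check_clouds(replies, f):
    """replies: {cloud_id: share, or None for a cloud that stayed silent}.
    Returns (secret, "ok"), (None, "unavailable") or (None, "byzantine").
    With at most f faulty clouds, at least f+1 replies are honest, and f+1
    honest points fix the degree-f polynomial, so every returned share must
    lie on it: a corrupted share is always detected.  With only 2f+1 clouds
    the manager can tell THAT a share is bad, not WHICH cloud sent it."""
    got = [(x, y) for x, y in replies.items() if y is not None]
    if len(got) < f + 1:
        return None, "unavailable"      # more than f clouds silent
    base = got[:f + 1]                  # any f+1 points define a candidate
    for x, y in got[f + 1:]:
        if interpolate(base, x) != y:   # remaining shares must agree with it
            return None, "byzantine"
    return interpolate(base, 0), "ok"   # the secret is the value at x = 0
```

Confidentiality comes from the same construction: any f or fewer shares are independent of the secret, so a single cloud (or a colluding minority of at most f) learns nothing from what it stores.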