http://www.iaeme.com/IJCET/index.asp 46 editor@iaeme.com
International Journal of Computer Engineering & Technology (IJCET)
Volume 6, Issue 11, Nov 2015, pp. 46-53, Article ID: IJCET_06_11_005
Available online at
http://www.iaeme.com/IJCET/issues.asp?JType=IJCET&VType=6&IType=11
ISSN Print: 0976-6367 and ISSN Online: 0976-6375
© IAEME Publication
___________________________________________________________________________
AN EFFICIENT ALGORITHM IN FAULT TOLERANCE FOR ELECTING COORDINATOR IN
DISTRIBUTED SYSTEMS
Manoj Niranjan
Rustamji Institute of Technology, BSF Academy, Tekanpur
Mahesh Motwani
Rajiv Gandhi Proudyogiki Vishwavidyalaya, Bhopal
Cite this Article: Manoj Niranjan and Mahesh Motwani. An Efficient
Algorithm in Fault Tolerance for Electing Coordinator in Distributed Systems.
International Journal of Computer Engineering and Technology, 6(11), 2015,
pp. 46-53.
http://www.iaeme.com/IJCET/issues.asp?JType=IJCET&VType=6&IType=11
1. INTRODUCTION
A distributed system consists of multiple self-governing computers [15] that
communicate over a computer network to attain a common goal. Computing and
computer-based systems, distributed systems in particular, are generally exposed
during normal operation to undesired changes in their internal structure or
external environment; such changes are referred to as faults [15]. A fault may be
an operational fault or a design fault, and it may occur once or repeatedly.
Fault tolerance techniques are used to make a system fault tolerant.
Checkpointing is one such technique: it periodically records the state of the
system in stable storage and provides fault tolerance without requiring extra
effort from the programmer [1]. A state of a process that is saved in this way is
called a checkpoint of that process [2,3]. A global state [4] [15] of a
distributed system is a set of individual process states, one per process
[2] [15]. Checkpointing may be either independent or coordinated. In independent
checkpointing, each process takes its checkpoints on its own, without any
synchronization with the other processes [15] [5]. In coordinated checkpointing,
the processes coordinate their checkpointing actions so that the set of local
checkpoints taken is consistent [6,7,8,9].
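As an illustration only, the idea of coordinated checkpointing can be sketched in Python. This sketch is not the algorithm proposed in this paper. It assumes a simple two-phase scheme (take a tentative checkpoint, then commit it everywhere or nowhere), keeps "stable storage" in memory, and uses hypothetical names (Process, coordinated_checkpoint). It also ignores messages in transit, which a real protocol must handle to guarantee a consistent global state.

```python
import copy


class Process:
    """Hypothetical process holding local state and its checkpoints."""

    def __init__(self, pid):
        self.pid = pid
        self.state = {"counter": 0}
        self.tentative = None  # checkpoint taken but not yet committed
        self.stable = None     # last committed checkpoint (stands in for stable storage)

    def take_tentative(self):
        # Phase 1: record the current state without committing it.
        self.tentative = copy.deepcopy(self.state)
        return True

    def commit(self):
        # Phase 2: promote the tentative checkpoint to stable storage.
        self.stable, self.tentative = self.tentative, None

    def abort(self):
        # Discard a tentative checkpoint that will not be committed.
        self.tentative = None

    def rollback(self):
        # Recovery: restore the last committed checkpoint, if any.
        if self.stable is not None:
            self.state = copy.deepcopy(self.stable)


def coordinated_checkpoint(processes):
    """Commit a checkpoint on every process, or on none of them."""
    ok = all(p.take_tentative() for p in processes)
    for p in processes:
        if ok:
            p.commit()
        else:
            p.abort()
    return ok


procs = [Process(i) for i in range(3)]
for p in procs:
    p.state["counter"] = 10
coordinated_checkpoint(procs)

procs[1].state["counter"] = 99  # work done after the checkpoint is lost on failure
for p in procs:
    p.rollback()
print([p.state["counter"] for p in procs])  # prints [10, 10, 10]
```

In independent checkpointing, by contrast, each process would save its own state without the all-or-nothing step, so the saved local states need not form a consistent set.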
The current work proposes a new coordinated checkpointing algorithm that
effectively selects a new coordinator process whenever the existing coordinator
stops working because of a failure. In this algorithm, the election of a new coordinator takes