K.Sabarigirivason et al, International Journal of Computer Science and Mobile Computing, Vol.3 Issue.2, February- 2014, pg. 80-85
© 2014, IJCSMC All Rights Reserved
Available Online at www.ijcsmc.com
International Journal of Computer Science and Mobile Computing
A Monthly Journal of Computer Science and Information Technology
ISSN 2320–088X
IJCSMC, Vol. 3, Issue. 2, February 2014, pg.80 – 85
RESEARCH ARTICLE
A NOVEL ON FAST PARALLEL FILE
TRANSFER USING REPLICATION
K. Sabarigirivason¹
M.E. Computer Science and Engineering,
Sri Eshwar College of Engineering,
Coimbatore, Tamilnadu, India.
sabari1151991@gmail.com
R. Giridharan²
M.E. Computer Science and Engineering,
Sri Eshwar College of Engineering,
Coimbatore, Tamilnadu, India.
giridharanmecse@gmail.com
Abstract- Data replication is one of the most critical components of a data-intensive grid computing environment. The need for data
replication arises in various areas of data analysis such as high-energy physics, bioinformatics, climate modeling and
astronomy. In addition to grid data environments, data replication is a key part of various data-sharing applications such
as digital libraries, persistent archival environments and content distribution. Parallel file replication, where a large file needs
to be replicated simultaneously to multiple sites, is an integral part of a data-intensive grid environment. We propose a tool that
creates multiple distribution trees by pipelining point-to-point transfers and optimizes the file replication time to multiple sites.
One of the key parts of data replication is the replica catalog, which manages the mappings of files from the hierarchical
namespace to one or more physical file locations, thus providing efficient and transparent file sharing on a grid.
Managing and coordinating the data movement process is the crucial performance issue.
Keywords— Data replication, data intensive, grid computing, pipelining, replica
I. INTRODUCTION
Designing a cost-efficient, secure network for transmitting data in parallel from one place to another is a challenging problem:
the data must be sent quickly and securely, data loss must be detected, and a transfer protocol such as GridFTP (a grid
extension of the File Transfer Protocol) must be used. Timely data replication is one of the most critical components of a data-intensive grid
computing environment. The need for this component arises in various areas of data analysis such as high-energy physics,
bioinformatics, climate modeling and astronomy. For example, terabytes and petabytes of data produced by CERN have to be
shared and accessed by the high-energy physics community around the world. In addition to grid data environments, data
replication is a key part of various data-sharing applications such as digital libraries, persistent archival environments and
content distribution. Beyond the replication strategy itself, the network (transport) mechanism used in the actual movement of the data
plays an equally important role in overall performance. In general, the access time in data replication depends on how the
data transport mechanism utilizes the network resources.
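The dependence of replication time on how the transport mechanism uses the network can be illustrated with an idealized model. The sketch below is not the proposed tool. It assumes every link has the same bandwidth, ignores latency and protocol overhead, and compares a source that sends the whole file to each site in turn with a single chain of sites that forward fixed-size chunks as soon as they arrive. The function names, the chunk size and the example figures are hypothetical and chosen only for illustration.

```python
import math


def sequential_time(file_mb: float, bandwidth_mb_s: float, sites: int) -> float:
    """Seconds needed when the source sends the full file to each site in turn.

    Assumption: one link of `bandwidth_mb_s` megabytes per second, no overhead.
    """
    return sites * file_mb / bandwidth_mb_s


def pipelined_time(file_mb: float, bandwidth_mb_s: float, sites: int,
                   chunk_mb: float) -> float:
    """Seconds needed when the sites form a chain and forward chunk by chunk.

    Assumption: each site starts forwarding a chunk once it has fully received
    it, and all links in the chain have the same bandwidth. The last chunk is
    counted as a full chunk, so the result is an upper bound when the file
    size is not a multiple of the chunk size.
    """
    chunks = math.ceil(file_mb / chunk_mb)
    chunk_time = chunk_mb / bandwidth_mb_s
    # The source needs `chunks` slots to emit the file; the last chunk then
    # needs one extra slot for every additional hop in the chain.
    return (chunks + sites - 1) * chunk_time


if __name__ == "__main__":
    # Hypothetical figures: 1000 MB file, 100 MB/s links, 4 sites, 10 MB chunks.
    print("sequential:", sequential_time(1000, 100, 4), "s")
    print("pipelined: ", pipelined_time(1000, 100, 4, 10), "s")
```

Under these assumptions the pipelined time grows by only one chunk slot per additional site, whereas the sequential time grows by a full file transfer per site. Real transfers would also have to account for heterogeneous links, which is one motivation for building several distribution trees rather than a single chain.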