A Model to Enhance the Performance of Distributed File System for Cloud Computing

Pradheep Manisekaran* and Ashwin Dhivakar M R**
*Assistant Professor, Department of Computer Science and Engineering, NIMS University, Jaipur, India. Email: pradheep45@hotmail.com
**Research Scholar, Jaipur National University, Jaipur. Email: ashdhiv@gmail.com

Abstract: Cloud computing is a new era of computer technology. Clouds have no borders, and data can be physically located anywhere, in any data center across a geographically distributed network. Large-scale distributed systems such as cloud computing applications are becoming commonplace, and they bring increasing challenges in how to transfer data and where to store and compute it. The most widely used distributed file systems built to meet these challenges are the Hadoop Distributed File System (HDFS) and the Google File System (GFS). HDFS, however, has some issues. The foremost is that it depends on a single name node to handle the majority of operations for every data block in the file system. As a result, the name node can become a resource bottleneck and a single point of failure. The second potential problem is that HDFS depends on TCP to transfer data. TCP typically takes several round trips before it sends at the full capacity of the links in the cloud, which results in low link utilization and longer download times. In such file systems, nodes simultaneously serve computing and storage functions; a file is divided into a number of chunks allocated to distinct nodes so that MapReduce tasks may be performed in parallel over the nodes. However, in a cloud, failure is commonplace, and nodes may be upgraded, replaced, and added to the system. Files can also be dynamically created, deleted, and appended. This results in load imbalance in a distributed file system; that is, the file chunks are not distributed as uniformly as possible among the nodes.
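To make the load-imbalance problem concrete, the following is a minimal sketch (not HDFS itself; the chunk size, node names, and placement are illustrative assumptions) of how a file is split into fixed-size chunks and how skewed chunk placement shows up as imbalance:

```python
# Illustrative sketch: fixed-size chunking and a simple imbalance metric.
# Chunk size is bytes here for the demo; HDFS uses 64/128 MB blocks.
CHUNK_SIZE = 64

def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Partition a byte string into fixed-size chunks."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

def imbalance(placement: dict) -> int:
    """Difference in chunks between the most- and least-loaded node."""
    counts = [len(chunks) for chunks in placement.values()]
    return max(counts) - min(counts)

data = bytes(300)                # a 300-byte "file"
chunks = split_into_chunks(data)
print(len(chunks))               # 5 chunks: four full, one partial

# Skewed placement: hypothetical node "n1" holds most chunks.
placement = {"n1": chunks[:4], "n2": chunks[4:], "n3": []}
print(imbalance(placement))      # 4 - 0 = 4 -> heavily imbalanced
```

A uniform distribution would keep this imbalance at zero or one chunk; dynamic creation, deletion, and appending of files is what drives it upward over time.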
Distributed file systems in production deployments depend strongly on a central node for chunk reallocation. This dependence is clearly inadequate in a large-scale, failure-prone environment, because the central load balancer carries a significant workload that scales linearly with the system size; it therefore becomes a performance bottleneck and a single point of failure. Moreover, suppose we save files in cloud storage and a third party accesses those files and adds extraneous data: this can damage our system. Thus, to improve the performance and security of cloud computing, this thesis uses a new approach: load balancing with a round-robin algorithm.

Keywords: Cloud computing, File system, Distributed system, Storage system, Load balancing.

Introduction

Cloud computing is a compelling technology. In the cloud, users can dynamically store and access their resources over the internet, without sophisticated deployment and management of those resources. Cloud computing is emerging as a new paradigm of large-scale distributed computing. It has moved computing and users' data away from desktops and portable devices into large data centers, and it can harness the power of the internet and wide-area networks to access resources that are available remotely (e.g., software, storage, data, networks). Cloud computing has two broad categories: the cloud and cloud technologies. The term "cloud" refers to a collection of infrastructure services such as Software as a Service, Infrastructure as a Service, and Platform as a Service. The term "cloud technologies" refers to various cloud runtimes such as the MapReduce framework [1], the Hadoop Distributed File System (HDFS), the Google File System (GFS), etc.
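The round-robin idea mentioned above can be sketched as follows. This is a minimal illustration, not the thesis implementation; the node names and the `itertools.cycle`-based dispatcher are assumptions for the example:

```python
# Minimal round-robin load-balancing sketch: items (file chunks or
# requests) are assigned to nodes in strict rotation, so node loads
# never differ by more than one item.
from itertools import cycle
from collections import defaultdict

def round_robin_assign(items, nodes):
    """Assign each item to the next node in rotation."""
    assignment = defaultdict(list)
    rotation = cycle(nodes)
    for item in items:
        assignment[next(rotation)].append(item)
    return dict(assignment)

chunks = [f"chunk-{i}" for i in range(7)]
nodes = ["node-a", "node-b", "node-c"]
placement = round_robin_assign(chunks, nodes)
for node, held in sorted(placement.items()):
    print(node, held)
# node-a holds chunks 0, 3, 6; node-b holds 1, 4; node-c holds 2, 5.
```

Because the rotation is stateless apart from its position, the dispatcher does no per-node bookkeeping, which is what lets it avoid the linearly growing workload of a central reallocation node.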
Cloud computing involves distributed technologies that satisfy many users and applications by providing functionality such as resource sharing, software, hardware, and information over the internet, in order to reduce capital and operational cost, increase performance in terms of response time and data-processing time, and maintain system stability. Day by day, the number of users, the amount of data, and the size of the network are growing rapidly, so this process involves many technical challenges, such as virtual machine migration, data transfer, performance bottlenecks, unpredictability, server consolidation, fault tolerance, scalable storage, and high availability; the major issue is load balancing. Meeting these challenges in large-scale, data-, compute-, and storage-intensive distributed applications such as search engines, cloud storage applications, and social networks requires robust, scalable, and efficient algorithms and protocols. The Google File System (GFS), used by Google, and the Hadoop Distributed File System (HDFS), deployed at Facebook and Yahoo! today, are the most common such systems. Distributed file systems are key building blocks for cloud computing applications based on the MapReduce framework. In such file systems, nodes simultaneously serve computing and storage functions; a file is partitioned into a number of chunks allocated to distinct nodes so that MapReduce tasks can be performed in parallel over the nodes.
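The MapReduce pattern described above, where each node processes the chunks it stores and the partial results are then merged, can be illustrated with a toy word count. This is only a sketch under simplified assumptions: real Hadoop jobs run map tasks on the data nodes holding each block, whereas here a local process pool stands in for the cluster:

```python
# Toy MapReduce word count: each chunk is mapped in parallel
# (a process pool stands in for the cluster nodes), then the
# per-chunk counts are merged in a reduce step.
from collections import Counter
from multiprocessing import Pool

def map_phase(chunk: str) -> Counter:
    """Map task: count the words in one chunk."""
    return Counter(chunk.split())

def reduce_phase(partials) -> Counter:
    """Reduce task: merge the per-chunk word counts."""
    total = Counter()
    for partial in partials:
        total += partial
    return total

if __name__ == "__main__":
    chunks = ["the cloud stores data", "the data lives in the cloud"]
    with Pool(2) as pool:              # two chunks -> two map tasks
        partials = pool.map(map_phase, chunks)
    print(reduce_phase(partials))      # 'the': 3, 'cloud': 2, 'data': 2, ...
```

Because each map task touches only its own chunk, the work parallelizes across however many nodes hold chunks; the uniformity of that chunk placement is exactly what the load-balancing problem is about.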