A Flexible and Adaptable Distributed File System

S. E. N. Fernandes¹, R. S. Lobato¹, A. Manacero¹, R. Spolon², and M. A. Cavenaghi²
¹ Dept. Computer Science and Statistics, Univ. Estadual Paulista UNESP, São José do Rio Preto, São Paulo, Brazil
² Dept. Computing, Univ. Estadual Paulista UNESP, Bauru, São Paulo, Brazil

Abstract: This work describes the development of a flexible and adaptable distributed file system model in which the main concepts of distributed computing are intrinsically incorporated. The file system incorporates characteristics such as transparency, scalability, fault tolerance, cryptography, support for low-cost hardware, easy configuration and file manipulation.

Keywords: Distributed file systems, fault-tolerance, data storage.

1. Introduction

The amount of stored data increases at an impressive rate, demanding more storage space and compatible processing speeds. To avoid complete data loss from failures or system overloads, it has become usual to adopt the distributed file model [1] [2]. A distributed file system (DFS) is a system in which files are stored across distinct computers linked through a communication network.

Even though several DFSs are capable of providing characteristics such as access/location transparency, performance, scalability, concurrency control, fault tolerance and security, providing all of them simultaneously is complex and difficult to manage. Another important aspect to consider is that when the complexity of one characteristic increases, the remaining ones may be negatively affected. This explains why most DFSs are developed to serve specific scenarios [2] [3] [4].

This paper proposes a novel model for a flexible DFS, named FlexA (Flexible and Adaptable Distributed File System), that can be adapted to the environment where it is being used.
This flexibility allows DFS features to be adapted or even replaced by others, including but not limited to the cryptography algorithm, the level of replication and the application programming interfaces. It also allows some tasks to be moved from servers to clients and supports several software and hardware configurations.

In the following sections we start with a brief description of other DFSs in use, focusing on the ones that are the basis for the model presented here. We then describe the proposed model, including its main characteristics and architecture. Results from the model evaluation are presented next, followed by the conclusions drawn from this evaluation and directions for future work.

2. Related work

Among the several existing DFSs, this work explored the key features of some models based on traditional designs and of some newer systems, in order to extract features for the development of a DFS with characteristics such as high performance, fault tolerance and ease of use.

2.1 Network File System

Network File System (NFS) [2] [3] is a DFS based on remote procedure calls (RPC). It provides a convenient medium to applications through a virtual layer (Virtual File System, VFS) that enables transparent access to NFS components [5] [6] [7].

2.2 Andrew File System

Andrew File System (AFS) was designed to scale to many users. To achieve this, aggressive cache policies are implemented on the client side, together with efficient techniques for consistency [2] [3].

2.3 Google File System

Google File System (GFS) operates on an architecture composed of parallel server clusters. GFS is distinguished by serializing and distributing files directly to the chunk servers, which are the actual storage nodes, without the need for additional accesses to the main server, called "master" [1].
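The separation between the master and the chunk servers described above can be illustrated with a minimal in-memory sketch. It is not GFS code: the class names (`Master`, `ChunkServer`, `Client`), the round-robin placement, the tiny chunk size and the client-side location cache are all assumptions made only for illustration. The point it shows is that the master answers metadata requests only, while file contents travel directly between the client and the chunk servers.

```python
CHUNK_SIZE = 4  # bytes; deliberately tiny so the example stays readable (assumption)


class ChunkServer:
    """Hypothetical storage node: holds chunk payloads keyed by a chunk handle."""

    def __init__(self):
        self.chunks = {}

    def write(self, handle, data):
        self.chunks[handle] = data

    def read(self, handle):
        return self.chunks[handle]


class Master:
    """Hypothetical master: stores metadata only, never file contents."""

    def __init__(self):
        self.table = {}   # (path, chunk index) -> (server id, chunk handle)
        self.lookups = 0  # number of location requests served

    def allocate(self, path, index, n_servers):
        server_id = index % n_servers  # naive round-robin placement (assumption)
        handle = f"{path}#{index}"
        self.table[(path, index)] = (server_id, handle)
        return server_id, handle

    def locate(self, path):
        self.lookups += 1
        # Return chunk locations ordered by chunk index.
        return [loc for (p, _), loc in sorted(self.table.items()) if p == path]


class Client:
    """Hypothetical client: asks the master where chunks live, then talks to servers."""

    def __init__(self, master, servers):
        self.master = master
        self.servers = servers
        self.location_cache = {}

    def write(self, path, data):
        self.location_cache.pop(path, None)  # cached locations may be stale
        for index, start in enumerate(range(0, len(data), CHUNK_SIZE)):
            server_id, handle = self.master.allocate(path, index, len(self.servers))
            # The payload goes straight to the chunk server, not through the master.
            self.servers[server_id].write(handle, data[start:start + CHUNK_SIZE])

    def read(self, path):
        if path not in self.location_cache:
            self.location_cache[path] = self.master.locate(path)
        locations = self.location_cache[path]
        return b"".join(self.servers[s].read(h) for s, h in locations)


master = Master()
servers = [ChunkServer() for _ in range(3)]
client = Client(master, servers)

client.write("/logs/a.txt", b"hello world!")
first = client.read("/logs/a.txt")
second = client.read("/logs/a.txt")  # served from cached locations, no master access
```

In this sketch the second read reuses the cached chunk locations, so the master is contacted only once for the file, which mirrors the idea of avoiding additional accesses to the master.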
2.4 Tahoe - The Least-Authority Filesystem

Tahoe-LAFS is a user-space DFS in which files are shared through a sequence of characters handled as part of a Uniform Resource Locator (URL). This form of sharing, combined with a decentralized security model based on individual access control, allows Tahoe-LAFS to manage directories and files as independent objects, which can be referenced by several processes using different names [8].

2.5 Other systems

Besides the systems just presented, other works establish different priorities for their DFS models, such as SPRITE [9], CODA [10], IBM General Parallel File System [11], Ceph [12], XtreemFS [13], HDFS [14], Red Hat Global File System [15] and GlusterFS [16]. Among these DFSs, the