A Measurement Study of the Wuala On-line Storage Service

Thomas Mager, Ernst Biersack, and Pietro Michiardi
EURECOM
Sophia Antipolis, France
{mager,erbi,michiardi}@eurecom.fr

Abstract—Wuala is a popular online backup and file sharing system that has been successfully operated for several years. Very little is known about the design and implementation of Wuala. We capture the network traffic exchanged between the machines participating in Wuala to reverse engineer its design and operation. When Wuala was launched, it used a clever combination of centralized storage in data centers for long-term backup with peer-assisted caching of frequently downloaded files. Large files are broken up into transmission blocks, and additional transmission blocks are generated using a classical redundancy coding scheme. Multiple transmission blocks are sent in parallel to different machines, and reliability is assured via a simple Automatic Repeat Request protocol on top of UDP. Recently, however, Wuala has adopted a pure client/server architecture. Our findings and the underlying reasons are substantiated by an interview with a co-founder of Wuala. The main reasons for this change are lower resource usage on the client side, which is important in the case of mobile terminals, a much simpler software architecture, and a drastic reduction in the cost of data transfers originating at the data center.

I. INTRODUCTION

Storing personal data online using cloud-based storage services such as Amazon S3 [1], Google Docs [2], and DropBox [3] has become both a business and an effective solution for the needs of a growing number of users. As a complement to cloud-based solutions, online storage systems based on a peer-to-peer (P2P) design have been investigated in academia to cope with problems related to the cost of long-term storage [4], security [5], and data lock-in [6].
In this work, we focus on a popular online backup and file sharing system, called Wuala¹, that has been successfully operated for several years. Wuala is particularly interesting because, when it was launched in 2008, it adopted a hybrid design that made use of servers in a data center while also leveraging the resources of the participating peers. However, not much is known about the design of Wuala. In this work we present some of the salient features of Wuala and discuss its recent evolution. In summary, the goal of this paper is to

• Characterize the infrastructure of Wuala, such as the number of servers involved in the operation of Wuala, and estimate the number of peers
• Determine the data placement adopted by Wuala to understand the mechanisms used to decide where data is stored
• Understand if and how coding techniques are used to assure the availability and durability of the data in case of node failures
• Determine the transport protocol used to move data between peers and servers
• Describe the evolution Wuala has undergone between 2010 and 2012.

¹ See http://www.wuala.com/.

Our findings indicate that Wuala uses a simple, yet clever system design. Data availability (making sure that files are accessible at any time) and durability (ensuring that files are never lost) are achieved by relying on servers located in a data center, instead of using peers. This simplifies the software architecture and avoids costly maintenance operations to cope with peer churn. In Wuala, user data is first interleaved and encoded before it is transmitted via a UDP-based transport protocol, which allows for parallel data transfers to improve transfer performance. Our results also reveal that in Wuala peers are only used as distributed caches that off-load the servers when delivering frequently accessed data.
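To make the idea of encoding data into redundant transmission blocks concrete, the following Python sketch splits a byte string into fixed-size blocks and appends one XOR parity block, so that any single lost block can be rebuilt from the others. This is an illustrative assumption only: the coding scheme actually used by Wuala is not specified at this point in the text, single-block XOR parity is swapped in here as the simplest possible redundancy code, and all function names and the block size are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def split_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Split data into equal-size blocks, zero-padding the last one."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if blocks and len(blocks[-1]) < block_size:
        blocks[-1] = blocks[-1].ljust(block_size, b"\x00")
    return blocks


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data: bytes, block_size: int) -> List[bytes]:
    """Return the data blocks followed by one parity block (hypothetical scheme)."""
    blocks = split_blocks(data, block_size)
    return blocks + [xor_blocks(blocks)]


def recover(blocks: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild exactly one missing block, marked as None."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) != 1:
        raise ValueError("exactly one block must be missing")
    present = [block for block in blocks if block is not None]
    repaired = list(blocks)
    # XOR of all surviving blocks (data and parity) equals the lost block.
    repaired[missing[0]] = xor_blocks(present)
    return repaired
```

A receiver that collected all blocks except one could call `recover` with `None` in place of the lost block. A scheme tolerating several losses would need a stronger code than this single-parity sketch.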
Recent measurements show that Wuala has evolved towards a pure client-server model: peer resources are no longer used, and data storage is offered as a cloud-based service addressing mainly business customers.

The remainder of this paper is organized as follows. In Section II we explain our experimental methodology. In Section III we give an overview of the Wuala architecture, followed by a discussion of how data is managed in Section IV. We describe the anatomy of uploading and downloading files in Sections V and VI, respectively. In Section VII we describe the reliable transport protocol of Wuala. Finally, we give some details on recent changes to Wuala and discuss why peer-to-peer based systems seem to be going out of fashion, before we conclude the paper.

II. GENERAL EXPERIMENTAL SETUP

We perform a series of experiments and measurements to elucidate the design and operation of Wuala. This is necessary because the Java bytecode is obfuscated [7], which prevents analyzing Wuala by studying its source code. For our experiments, we run several Wuala clients in our lab and capture all the network traffic between our Wuala clients and the rest of the Internet. We use two tools to analyze the traffic: