HTTPOS: Sealing Information Leaks with Browser-side Obfuscation of Encrypted Flows

Xiapu Luo §*, Peng Zhou §, Edmond W. W. Chan §, Wenke Lee †, Rocky K. C. Chang §, Roberto Perdisci ‡
The Hong Kong Polytechnic University §, Georgia Institute of Technology †, University of Georgia ‡
{csxluo,cspzhouroc,cswwchan,csrchang}@comp.polyu.edu.hk, wenke@cc.gatech.edu, perdisci@cs.uga.edu

Abstract

Leakage of private information from web applications, even when the traffic is encrypted, is a major security threat to many applications that use HTTP for data delivery. This paper considers the problem of inferring from encrypted HTTP traffic the web sites or web pages visited by a user. Existing browser-side approaches to this problem cannot defend against more advanced attacks, and server-side approaches usually require modifications to web entities, such as browsers, servers, or web objects. In this paper, we propose a novel browser-side system, namely HTTPOS, to prevent information leaks and offer much better scalability and flexibility. HTTPOS provides a comprehensive and configurable suite of traffic transformation techniques for a browser to defeat traffic analysis without requiring any server-side modifications. Extensive evaluation of HTTPOS on live web traffic shows that it can successfully prevent the state-of-the-art attacks from inferring private information from encrypted HTTP flows.

1 Introduction

Leakage of private information from web applications is a major security threat to many applications that use HTTP for data delivery. Cloud computing and other similar service-oriented platforms will only exacerbate this problem, because these services are usually delivered through web browsers. Moreover, it is well known that data encryption alone is insufficient for preventing information leaks.
For instance, traffic-analysis attacks can identify the web sites visited by a user from encrypted traffic [7, 22, 23, 26] and anonymized NetFlow records [14]. Chen et al. have further shown that sensitive personal information, such as medical records and financial data, could also be inferred through traffic analysis [13]. Besides, a user's browser could be fingerprinted [39], and her browsing patterns could be profiled from traffic features [29]. A common approach to preventing leaks is to obfuscate the encrypted traffic by changing the statistical features of the traffic, such as the packet size and packet timing information [13, 23, 35, 38].

* Most of the work by this author was performed while at Georgia Tech.

Existing methods for defending against information leaks, however, suffer from several problems. A major problem is that, as server-side solutions, they require modifications of web entities, such as browsers, servers, and even web objects [13, 38]. Modifying the web entities is not feasible in many circumstances and cannot easily satisfy different applications' requirements on information leak prevention. A second fundamental problem with these methods is that they are still vulnerable to some advanced traffic-analysis attacks. For example, although Sun et al. [35] proposed twelve approaches to defeat their traffic-analysis attack based on web object size, new attacks based on the tuple of packet size and direction [26] could still identify the web sites visited by a user. Finally, the efficacy of these methods has not been validated thoroughly based on actual implementations and live HTTP traffic. An exception is the work from Chen et al. [13] that is implemented as an IIS extension and a Firefox add-on.

In this paper we explore a browser-side approach to prevent information leaks from encrypted web traffic.
Compared with the server-side approach, the browser-side approach has a scalability advantage, because only the traffic between the browser and the visited servers needs to be obfuscated. Moreover, users can choose which encrypted flows to obfuscate in order to conserve resources and reduce the impact on performance; this flexibility is very difficult to obtain from a server-side approach. However, designing a browser-side method is very challenging, because the server's behavior cannot be directly modified to evade traffic analysis. That is, previously proposed methods that assume the capability of modifying the server's behavior cannot be applied to the browser-side approach.

We show in this paper that it is possible to devise a browser-side method to defeat traffic analysis by presenting HTTPOS (which stands for HTTP or HTTPS with Obfus-