Multiple classifier systems under attack

Battista Biggio, Giorgio Fumera, and Fabio Roli
Dept. of Electrical and Electronic Eng., Univ. of Cagliari
Piazza d'Armi, 09123 Cagliari, Italy
{battista.biggio,fumera,roli}@diee.unica.it
WWW home page: http://prag.diee.unica.it

Abstract. In adversarial classification tasks like spam filtering, intrusion detection in computer networks and biometric authentication, a pattern recognition system must not only be accurate, but also robust to manipulations of input samples made by an adversary to mislead the system itself. It has recently been argued that the robustness of a classifier could be improved by avoiding overemphasizing or underemphasizing input features on the basis of training data, since at operation phase the importance of features may change due to modifications introduced by the adversary. In this paper we empirically investigate whether the well-known bagging and random subspace methods improve the robustness of linear base classifiers by producing more uniform weight values. To this end we use a method for evaluating the performance of a classifier under attack that we are currently developing, and carry out experiments on a spam filtering task with several linear base classifiers.

1 Introduction

In adversarial classification tasks like spam filtering, intrusion detection in computer networks and biometrics [1-4], the goal of a pattern recognition system is to discriminate between two classes, which can be named "legitimate" and "malicious", while an intelligent adversary manipulates samples to mislead the system itself. Adversarial classification problems are therefore non-stationary, which implies that a pattern recognition system should be designed by taking into account not only its accuracy (usually evaluated on a set of training samples) but also its robustness, namely its capability to suffer as small an accuracy degradation as possible when it is under attack.
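The hypothesis investigated in the abstract above can be sketched in a few lines of code: train a linear classifier on bootstrap replicates of the data and average the resulting weight vectors (for linear base classifiers, averaging the ensemble outputs is equivalent to averaging the weights), then compare how evenly the weights are spread. This is a minimal illustrative sketch, not the authors' experimental setup: the synthetic data, the logistic-regression trainer, the `dispersion` measure, and all hyperparameters are our own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-class data standing in for spam features:
# a few strongly discriminant features and many weakly discriminant ones.
n, d = 500, 20
means = np.zeros(d)
means[:3] = 2.0      # strongly discriminant features
means[3:] = 0.3      # weakly discriminant features
y = rng.integers(0, 2, n) * 2 - 1            # labels in {-1, +1}
X = rng.normal(0.0, 1.0, (n, d)) + np.outer(y, means)

def train_logreg(X, y, lr=0.1, epochs=200):
    """Plain batch gradient descent on the logistic loss (no intercept)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        margins = y * (X @ w)
        grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
        w -= lr * grad
    return w

# Single linear classifier trained on all data.
w_single = train_logreg(X, y)

# Bagging: train on bootstrap replicates, then average the weight vectors.
n_bags = 25
ws = []
for _ in range(n_bags):
    idx = rng.integers(0, n, n)              # bootstrap sample with replacement
    ws.append(train_logreg(X[idx], y[idx]))
w_bag = np.mean(ws, axis=0)

def dispersion(w):
    """Coefficient of variation of |w|: lower means more uniform weights."""
    a = np.abs(w)
    return a.std() / a.mean()

print(f"weight dispersion, single: {dispersion(w_single):.3f}")
print(f"weight dispersion, bagged: {dispersion(w_bag):.3f}")
```

Whether the bagged weights are actually more uniform, and whether that uniformity translates into robustness under attack, is precisely the empirical question the paper sets out to answer; the sketch only makes the quantity under study concrete.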
Very few works have so far addressed the problem of devising practical methods to improve robustness. Recently, it was suggested in [7] that a more robust classifier could be obtained by avoiding giving features too much or too little emphasis during classifier training, and a similar approach was suggested in [6]. This allows one to design classifiers that are robust against attacks in which the adversary exploits some knowledge of the classification function (e.g., the most discriminant features), as we discuss in the next section. It is well known that one of the main motivations for the use of multiple classifier systems (MCSs) is the improvement of classification accuracy with respect