Negative Selection with High-dimensional Support for Keystroke Dynamics

Paulo Henrique Pisani
Universidade Federal do ABC (UFABC)
São Paulo, Brazil
Email: paulo.pisani@ufabc.edu.br

Ana Carolina Lorena
Universidade Federal do ABC (UFABC)
São Paulo, Brazil
Email: ana.lorena@ufabc.edu.br

Abstract—Computing and communication systems have been expanding and bringing a number of advancements to our way of life. However, this technological evolution has also contributed to the rise of identity theft, mainly due to the advent of the digital identity. One way to address this problem is to analyse user behavior, an approach known as behavioral intrusion detection. Among the possible aspects to be analysed, this work focuses on keystroke dynamics, which consists of recognizing users by their typing rhythm. This paper compares novelty detectors applied to keystroke dynamics: immune negative selection algorithms and auto-associative neural networks. Issues regarding the use of negative selection in high-dimensional spaces are discussed and an alternative to deal with this problem is presented.

Keywords-keystroke dynamics; artificial immune systems; negative selection;

I. INTRODUCTION

Digital identities represent a key advancement in our society. However, their dissemination has contributed to increased data exposure and, consequently, to identity theft [1]. Identity theft takes place when a person uses someone else's personal information to illegally impersonate that person [2]. A promising way to curb this problem is the use of behavioral intrusion detection systems [3], which flag anomalous behavior as a potential intrusion.

Among the possible user aspects to be analysed, keystroke dynamics is studied here. This work shows the application of immune negative selection algorithms (NSAs) for recognizing users by their typing rhythm.
These algorithms are novelty detectors, a class of classifiers that uses only samples from the positive class during the training phase. Afterwards, in the matching phase, these classifiers are able to differentiate between positive and negative data. As intruder samples are not always available, novelty detection is more suitable for keystroke dynamics than binary classification, which requires both positive and negative samples in the training phase. Novelty detectors are sometimes referred to as one-class classifiers [4].

A key issue when applying NSAs is their lack of support for high-dimensional spaces [5], which prevents their widespread use in some real-world problems. This paper proposes an alternative that overcomes this issue by using cosine similarity. An auto-associative multilayer perceptron (AAMLP), a well-known novelty detector, is used as a baseline to evaluate negative selection performance.

Throughout the paper, we present background information on keystroke dynamics and negative selection algorithms. At the end, we analyse the results obtained by the studied algorithms on a benchmark database. This work is organized as follows: in Section II, related work on keystroke dynamics is presented; Section III introduces negative selection algorithms and presents an NSA with high-dimensionality support for keystroke dynamics; Section IV details the experiments conducted here; Section V presents and discusses the results; and, finally, in Section VI, the conclusions are drawn.

II. KEYSTROKE DYNAMICS

Keystroke dynamics is considered to be a behavioral biometric technology and has several advantages over other technologies. Firstly, its implementation does not require any additional hardware expenses, while other biometric technologies do (e.g. iris, fingerprint) [1].
Moreover, as the user does not need to perform actions specifically for the biometric system, the level of transparency of keystroke dynamics is enhanced, in contrast to a fingerprint or iris system, for instance, in which the user has to use a reader device. All these aspects contribute to increased user acceptability of this biometric technology [6].

The area of keystroke dynamics has been studied for more than 30 years and a number of works are available in the literature. One of the first works in the area is from 1980 [7]. Table I shows some of the research carried out in keystroke dynamics. For each study, the table lists the number of users that took part in the experiments and the best performance reported. This table is based on an adapted systematic review on keystroke dynamics we conducted [8].

There are two main forms of reporting results in keystroke dynamics:

FAR and FRR: FAR (False Acceptance Rate) indicates the rate at which an intruder is misclassified as being a legitimate user, and FRR (False Rejection Rate) indicates the rate at which a legitimate user is wrongly rejected by the system [6]. Usually, there is a trade-off between FAR and FRR, so that when FAR increases, FRR tends to decrease, and vice versa.

EER: EER (Equal Error Rate) is the error rate at the operating point where FAR and FRR are equal [9].

2012 Brazilian Symposium on Neural Networks, 1522-4899/12 $26.00 © 2012 IEEE, DOI 10.1109/SBRN.2012.15
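The FAR, FRR and EER definitions above can be illustrated with a minimal sketch. It assumes that each typing sample has already been reduced to a similarity score in which higher values mean "more like the legitimate user", and that a sample is accepted when its score reaches a threshold. The function names, the toy scores and the threshold scan are illustrative assumptions and are not taken from the paper.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) when a sample is accepted if score >= threshold."""
    # FAR: fraction of intruder samples that are wrongly accepted.
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    # FRR: fraction of legitimate samples that are wrongly rejected.
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr


def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by trying every observed score as a threshold.

    Returns (eer, threshold) for the threshold where |FAR - FRR| is smallest;
    the EER is reported as the mean of FAR and FRR at that point.
    """
    candidates = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap, best_eer, best_threshold = None, None, None
    for t in candidates:
        far, frr = far_frr(genuine_scores, impostor_scores, t)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer, best_threshold = gap, (far + frr) / 2.0, t
    return best_eer, best_threshold


# Toy scores (hypothetical, for illustration only).
genuine = [0.9, 0.8, 0.7, 0.4]    # samples typed by the legitimate user
impostor = [0.6, 0.3, 0.2, 0.1]   # samples typed by intruders

print(far_frr(genuine, impostor, 0.5))      # (0.25, 0.25)
print(equal_error_rate(genuine, impostor))  # (0.25, 0.6)
```

Raising the threshold in this sketch lowers FAR and raises FRR, which is the trade-off described above. With finite data FAR and FRR may never be exactly equal, so the closest point is used as an approximation.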