Generalization Regions in Hamming Negative Selection

Thomas Stibor 1, Jonathan Timmis 2, and Claudia Eckert 1

1 Darmstadt University of Technology, Department of Computer Science, Hochschulstr. 10, 64289 Darmstadt, Germany
2 University of York, Department of Electronics and Department of Computer Science, Heslington, York, United Kingdom

Abstract

Negative selection is an immune-inspired algorithm which is typically applied to anomaly detection problems. We present an empirical investigation of the generalization capability of Hamming negative selection when combined with the r-chunk affinity metric. Our investigations reveal that when using the r-chunk metric, the length r is a crucial parameter and is inextricably linked to the input data being analyzed. Moreover, we propose that input data with different characteristics, i.e. different positional biases, can result in an incorrect generalization effect.

1 Introduction

Negative selection was one of the first immune-inspired algorithms proposed, and is a commonly used technique in the field of artificial immune systems (AIS). Negative selection is typically applied to anomaly detection problems, which can be considered a type of pattern classification problem, and is typically employed as a (network) intrusion detection technique. The goal of (supervised) pattern classification is to find a functional mapping between input data X and a class label Y so that Y = f(X). The mapping function is the pattern classification algorithm, which is trained (or learnt) with a given number of labeled samples called training data. The aim is to find the mapping function which gives the smallest possible error in the mapping, i.e. minimizes the number of samples where Y is the wrong label (this is especially important for test data not used by the algorithm during the learning phase).
In the simplest case there are only two different classes, with the task being to estimate a function f : ℝ^N → {0, 1} ∋ Y, using training data pairs

(X_1, Y_1), ..., (X_n, Y_n) ∈ ℝ^N × Y, Y ∈ {0, 1},

generated i.i.d.¹ according to an unknown probability distribution P(X, Y), such that f will correctly classify unseen samples (X, Y). If the training data consists only of samples from one class, and the test data contains samples from two or more classes, the classification task is called anomaly detection.

¹ independently drawn and identically distributed
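The anomaly detection setting above, combined with the r-chunk metric named in the abstract, can be sketched in a few lines. The following is an illustrative assumption of the standard r-chunk matching rule (a detector is a start position p and a bit string d of length r, matching a string s when s[p:p+r] equals d); the function names and the tiny self set are invented for the example, not taken from the paper.

```python
from itertools import product

def r_chunk_matches(detector, s):
    """An r-chunk detector (p, d) matches s iff s[p:p+r] == d."""
    p, d = detector
    return s[p:p + len(d)] == d

def generate_detectors(self_set, l, r):
    """Negative selection: keep only detectors matching NO self string."""
    detectors = []
    for p in range(l - r + 1):                      # every start position
        for bits in product("01", repeat=r):        # every r-bit chunk
            d = "".join(bits)
            if not any(r_chunk_matches((p, d), s) for s in self_set):
                detectors.append((p, d))
    return detectors

def is_anomalous(s, detectors):
    """A string matched by any surviving detector is flagged anomalous."""
    return any(r_chunk_matches(det, s) for det in detectors)

self_set = {"0110", "0101"}                 # toy "normal" class, l = 4
detectors = generate_detectors(self_set, l=4, r=2)
# Training uses only one class (self); unseen non-self strings are the anomalies.
```

By construction, every string in the self set is classified as normal, while strings whose r-length windows differ from all self windows (e.g. "1111" here) are flagged; how well this generalizes to unseen normal strings depends on the choice of r, which is the question the paper investigates.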