The Fuzzy Artificial Immune System: Motivations, Basic Concepts, and Application to Clustering and Web Profiling

Olfa Nasraoui*, Fabio Gonzalez, and Dipankar Dasgupta
Dept. of Electrical and Computer Engineering* and Intelligent Security Systems Research Lab, Division of Computer Science
The University of Memphis
e-mail: {onasraou,fgonzalz,ddasgupt}@memphis.edu

Abstract - The human immune system can be seen as a complex network structure that is able to respond to an almost unlimited multitude of foreign invaders such as viruses and bacteria. Hence, this parallel and distributed adaptive system holds tremendous potential for many intelligent computing applications, including Web mining. Some of these immunity-based techniques involve the development and analysis of algorithms that can identify patterns in observed data in order to make predictions about unseen data. In this paper, we introduce several new enhancements that address some of the weaknesses of previous artificial immune system models. In particular, we address the uncertainty and fuzziness inherent in the matching process that takes place between antibodies and antigens. This problem is handled by introducing a fuzzy artificial immune system. Mimicking the body's adaptive learning and defense mechanisms in the face of invading biological agents, the fuzzy artificial immune system is used as a monitoring and learning system for a Web site in the face of all incoming Web requests.

1. INTRODUCTION

Most living organisms exhibit extremely sophisticated learning and processing abilities that allow them to survive and proliferate generation after generation in their dynamic and competitive environments. For this reason, nature has long served as an inspiration for many scientific and technological developments. One such natural system is the immune system, which can be seen as a parallel and distributed adaptive system [2] that has tremendous potential in many intelligent computing applications.
This is because the immune system exhibits the following strengths: recognition, feature extraction, diversity, learning, memory, distributed detection, and self-regulation [2,3]. The immune system uses combinatorics to construct pattern detectors efficiently. Moreover, the detection/recognition process is highly distributed in nature. Based on these underlying mechanisms, an intelligent computational technique has been developed for pattern recognition and data analysis [11].

One data repository that has lately come to affect every aspect of our lives is the World Wide Web (WWW). In addition to its ever-expanding size and lack of structure, the WWW has not been responsive to user preferences and interests. One way to deal with this problem is through personalization. Mining information from the user's interaction is another approach towards personalization. Perkowitz and Etzioni [16] proposed adapting Web pages based on a user's traversal pattern. In [1], associations and sequential patterns between Web transactions are discovered. Most of the above efforts have relied on relatively simple techniques, which can be inadequate for real user profile data because they are not resilient to the “noise” typically found in user traversal patterns. To handle possibly unknown noise contamination rates in Web data, Nasraoui et al. [13] proposed mining the Web log data using a fuzzy relational clustering algorithm based on a robust estimator. In that work, they also proposed a formal definition of a “robust” user profile and “robust” quantitative evaluation measures. To deal with the fuzzy nature of Web data and to automatically determine the number of clusters, profiles were extracted [14,15] using an unsupervised fuzzy relational clustering algorithm based on competitive agglomeration.

The rest of the paper is organized as follows. In Section 2, we present an overview of the natural immune system.
In Section 3, we review some of the current artificial immune system models. In Section 4, we present our fuzzy AINE model. In Section 5, we illustrate the use of the fuzzy AINE model for clustering. In Section 6, we describe our artificial-immune-system-inspired approach to Web usage mining. In Section 7, we illustrate the performance of our approach in extracting session profiles from the access log file of a real Web site. Finally, in Section 8, we present our conclusions and future prospects.

2. THE NATURAL IMMUNE SYSTEM

The natural immune system is a distributed novel-pattern detection system with several functional components positioned in strategic locations throughout the body. The main purpose of the immune system is to recognize all cells (or molecules) within the body and categorize those cells as self or non-self. The non-self cells are further categorized in order to stimulate an appropriate type of defensive mechanism. The immune system learns through evolution to distinguish between foreign antigens (e.g., bacteria and viruses) and the body's own cells or molecules. The lymphocyte is the main type of immune cell participating in the immune response that possesses the attributes of specificity, diversity, memory, and adaptivity. There are two subclasses of lymphocytes, T and B, each having its own function. In particular, B-cells secrete antibodies that can bind to specific antigens.

3. ARTIFICIAL IMMUNE SYSTEM MODELS

Artificial Immune Systems emerged in the 1990s as a new computational research area. Artificial Immune Systems link several emerging computational fields inspired by biological

0-7803-7280-8/02/$10.00 ©2002 IEEE