A New User-Based Model for Credit Card Fraud Detection Based on Artificial Immune System

Neda Soltani
Faculty of Computer Engineering and Information Technology
Amirkabir University of Technology
Tehran, Iran
neda.soltani@aut.ac.ir

Mohammad Kazem Akbari
Faculty of Computer Engineering and Information Technology
Amirkabir University of Technology
Tehran, Iran
akbarif@aut.ac.ir

Mortaza Sargolzaei Javan
Faculty of Computer Engineering and Information Technology
Amirkabir University of Technology
Tehran, Iran
msjavan@aut.ac.ir

Abstract—In this paper we present a new model based on Artificial Immune System for credit card fraud detection. The model, which builds on the Artificial Immune Recognition System, takes user behavior into account. It combines the two methodologies of fraud detection, namely tracking account behavior and general thresholding. The system generates normal memory cells from each user’s own transaction records, whereas fraud memory cells are generated from all fraudulent records. To get more accurate results, we have analyzed the training data in order to control the number of memory cells. During the test phase, each user’s transaction is presented to his/her own normal memory cells, together with the fraud memory cells.

Keywords-Artificial Immune System; Credit Card Fraud Detection; User Profiling

I. INTRODUCTION

Credit cards are used everywhere and have become a successful means of modern payment, yet they suffer from misuse. The use of plastic cards in everyday payments makes it easier for fraudsters to find novel ways of misusing them. In this context, we define misuse as unauthorized account activity committed by means of the debit/credit facilities of a legitimate account [1]. Fraud detection is the act of recognizing such activity and stopping it as soon as possible, that is, before the transaction is completed. The related approaches are divided into two main subcategories.
These are the absolute analysis, which searches for thresholds between legal and fraudulent behavior, and the differential approach, which tries to detect extreme changes in a user’s behavior [2]. The first approach is a supervised method that needs fraud records to create the model and decide on thresholds. The second approach, in contrast, is based on user behavior and may use user profiling, behavioral models, and related methods; transactions that differ saliently from normal behavior are flagged as fraud.

Confirming by phone whether each transaction was made by the client or by a fraudster is cost-prohibitive if it is done for all transactions. Automatic fraud detection is where well-known classification methods can be applied and where pattern recognition systems play a very important role: one can learn from the past (fraud that has already happened) and classify new instances (transactions) [7].

According to [1], a fraud detection system faces several challenges that stem from the nature of the transaction data and from particular operational issues:

The number of transactions processed by plastic card issuers daily is high; furthermore, each transaction includes more than 70 fields of coded information.

Transaction data is heterogeneous and time-varying within and between accounts. Patterns and trends vary significantly for different groups of merchants, holiday seasons and geographical regions.

The generally accepted fraud rate within the plastic card industry is 0.1–0.2%, i.e. the occurrence of fraud is relatively rare. This frequently leads to the problem that the majority of cases flagged by the fraud detection system as potentially fraudulent are in fact legitimate. This type of error is referred to as a false positive (FP). As the number of FPs increases, so do the associated costs and customer inconvenience.
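
The effect of such a low fraud rate on the share of correct alerts can be made concrete with a small calculation. The following is a minimal sketch for illustration only: the fraud rates are the 0.1–0.2% quoted above, while the function name `flagged_precision` and the values of `detection_rate` and `false_positive_rate` are hypothetical assumptions, not figures from [1] or from the model proposed in this paper.

```python
def flagged_precision(fraud_rate, detection_rate, false_positive_rate):
    """Fraction of flagged transactions that are actually fraudulent.

    fraud_rate:          share of all transactions that are fraudulent
    detection_rate:      share of fraudulent transactions that get flagged
    false_positive_rate: share of legitimate transactions that get flagged
    """
    true_positives = fraud_rate * detection_rate
    false_positives = (1.0 - fraud_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Fraud rates taken from the text (0.1% and 0.2%).
# The detector quality below is an assumed, hypothetical setting.
for rate in (0.001, 0.002):
    share = flagged_precision(rate, detection_rate=0.9, false_positive_rate=0.01)
    print(f"fraud rate {rate:.1%}: {share:.1%} of alerts are real fraud")
```

Under these assumed rates, fewer than one in six alerts corresponds to real fraud, even though the hypothetical detector misflags only 1% of legitimate transactions. This is why the majority of flagged cases can turn out to be legitimate.
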
Alerts arising from the fraud detection system are usually passed on to the fraud department for further investigation. Where bank policy requires it, suspected cases are followed up with a call to the cardholder to verify the transactions. As a result, the number of alerts should be kept at a level that the available investigators and fraud analysts can handle.

Fraudulent cases missed by the fraud detection system are reported to the issuing company when the cardholder identifies that their account has been compromised. This can take up to several months, resulting in a delay in correctly labeling each case. Some fraudulent cases remain unidentified and are therefore mislabeled. Thus, a fraud detection model is almost certainly trained on noisy data.

Sponsor: Iran Telecommunication Research Center
The 16th CSI International Symposium on Artificial Intelligence and Signal Processing (AISP 2012)
978-1-4673-1479-4/12/$31.00 ©2012 IEEE

Fraud detection techniques which have been developed for a special field can be non-effective in other