Testing the fraud detection ability of different user profiles by means of FF-NN classifiers

DRAFT - PLEASE DO NOT REDISTRIBUTE

Constantinos S. Hilas 1, John N. Sahalos 2

1 Dept. of Informatics and Communications, Technological Educational Institute of Serres, Serres GR-621 24, Greece
chilas@physics.auth.gr
2 Radiocommunications Laboratory, Aristotle University of Thessaloniki, Thessaloniki GR-541 24, Greece
sahalos@auth.gr

Abstract. Telecommunications fraud has drawn the attention of researchers because of the huge economic burden it places on companies and because of the interesting problem of characterizing users’ behavior. In the present paper, we deal with the issue of user characterization. Several real cases of defrauded user accounts were studied under different user profiles. Each profile’s ability to characterize user behavior, so that normal activity can be discriminated from fraudulent activity, was tested. Feed-forward neural networks were used as classifiers. It is found that summary characteristics of a user’s behavior perform better at this task than detailed ones.

1 Introduction

Telecommunications fraud can be simply described as any activity by which telecommunications service is obtained without intention of paying [1]. Under this definition, fraud can only be detected once it has occurred. So, it is useful to distinguish between fraud prevention and fraud detection [2]. Fraud prevention comprises all the measures that can be used to stop fraud from occurring in the first place. In telecommunication systems, these include Subscriber Identity Module (SIM) cards and other Personal Identification Number (PIN) schemes, such as the ones used in private branch exchanges (PBXs). No prevention method is perfect, and each is usually a compromise between effectiveness and convenience of use. Fraud detection, on the other hand, is the identification of fraud as quickly as possible once it has happened.
The problem is that fraud techniques are constantly evolving: whenever a detection method becomes known, fraudsters adapt their strategies and try others. Reference [1] provides a classification of telecommunication systems fraud and divides it into four groups, namely contractual fraud, hacking fraud, technical fraud and procedural fraud. In [3], twelve distinct fraud types are identified. The authors of the present article have also witnessed fraudulent behavior that combines several of the above-mentioned types [4].

Telecommunications fraud has drawn the attention of many researchers in recent years, not only because of the huge economic burden on companies’ accounts but also because of the interesting problem of user behavior characterization. Fraud detection techniques involve monitoring users’ behavior in order to identify deviations from an expected norm. Research in telecommunications fraud detection is mainly motivated by fraudulent activities in mobile technologies [1, 3, 5, 6]. The techniques used come from the area of statistical modeling and include rule discovery [5, 7, 8, 9], clustering [10], Bayesian rules [6], visualization methods [11], and neural network classification [5, 12, 13]. Combinations of more than one method have also been proposed [14, 15]. A bibliography on the use of data mining and machine learning methods for automatic fraud detection can be found in [16].

Most of the aforementioned approaches use a combination of examples of legitimate user behavior and some examples of fraud. The aim is to detect any change in usage relative to the legitimate user’s history. In the present paper we are interested in evaluating different user representations and their effect on the proper discrimination between legitimate and fraudulent activity.

The paper proceeds as follows. In the next section the data that were used are described along with the