Abstract—This paper presents a novel approach to
knowledge extraction from large-scale datasets using a neural
network, applied to the real-world problem of payment
card fraud detection. Fraud is a serious and long-term threat
to a peaceful and democratic society. We present SOAR
(Sparse Oracle-based Adaptive Rule) extraction, a practical
approach that processes large datasets and extracts key
generalizing, comprehensible rules, using a trained neural network
as an oracle to locate key decision boundaries. Experimental
results indicate that a high level of rule comprehensibility can be
achieved with an acceptable level of accuracy. SOAR
extraction outperformed the best decision tree induction
method and produced over 10 times fewer rules, aiding
comprehensibility. Moreover, the extracted rules uncovered
fraud facts of key interest to industry fraud analysts.
I. INTRODUCTION
Fraud is prevalent in many high-volume areas, such as online
shopping, telecommunications, banking, social
security claims, etc., where a manual review of all
transactions is not possible and decisions must be made
quickly to prevent crime. Fraud is increasing with the
expansion of computing technology and globalization, with
criminals devising new frauds to overcome the strategies
already in place to stop them. Automating the detection of
fraud, through the use of a Fraud Management System
(FMS), is therefore of strategic importance. One type of
fraud is payment card fraud: the criminal act of
deception through the use of a physical plastic card or card
information without the knowledge of the cardholder. When
a transaction takes place, the details of that transaction are
processed by the acquiring bank for authorization. It is
reported that total card fraud losses cost banks and merchants
$8.6 billion per year in the USA [1] and £609.9 million in the
UK [2], despite the FMS tools already in place to tackle
the problem. There are three types of payment card fraud: (1) collusion
between a merchant and a cardholder using false
transactions; (2) fraud committed using the physical payment card,
called Cardholder Present (CP), such as the interception of
new credit cards in the mail, stolen/lost cards, the copying
of card information onto counterfeit physical cards,
employee fraud at the issuing bank, etc.; and (3) fraud committed
through the use of the internet or telephone, where the
Cardholder is Not Present (CNP) at the point of transaction.
Manuscript received May 2, 2010. This work was supported in part by
Retail Decisions Europe Ltd. (http://www.redplc.com).
Nick F Ryman-Tubb is with City University London, Department of
Computing, Northampton Square, London, EC1V 0HB, UK (phone: +44 (0)
20 7040 4053; e-mail: nick.ryman-tubb@soi.city.ac.uk; web:
http://www.soi.city.ac.uk/neural).
Artur d'Avila Garcez is with City University London, Department of
Computing, Northampton Square, London, EC1V 0HB, UK (phone: +44 (0)
20 7040 4053; e-mail: aag@soi.city.ac.uk).
A. The Importance of Fraud Detection
Traditionally, public perceptions of fraud are tempered by
a belief that it is a “white-collar” crime which targets the
wealthy and big business and is of less personal concern, as
the effects are cushioned for the victim [3]. However, mafia
figures and other violent criminals are increasingly moving
into fraud [4] so that payment card fraud now involves the
threat of violence, including murder. In the USA, the fear of
fraud now supersedes that of terrorism, computer and health
viruses and personal safety [5], and in the UK the Attorney
General describes fraud as “second only to drug trafficking
in causing harm to the economy and society” [6]. Today,
the proceeds from fraud are paying for organized crime,
drug smuggling and terrorism [7, 8].
Existing FMS approaches are not keeping pace [9], with
firms rating payment fraud as the most critical threat to their
business: “…as long as criminals believe they can get away
with committing fraud, the problem will continue to grow to
a point where it may challenge the competitiveness of the
online model”. If anti-fraud technologies do not keep pace,
businesses lose money through charge-backs and fines, loss of
goods, loss of reputation, withdrawal of their payment card
facilities and, in some cases, business failure. To detect
fraud, organizations use a range of methods. At the most
basic level, this is a list of internal procedures such as fixed
credit limits, transaction volume limits and so on. However,
only a small number now rely on manual methods alone,
with the majority employing some form of automated FMS.
The FMS is often a rule-based system that stores and uses
knowledge in a transparent way and is easy for a fraud
expert to modify and interpret. Rules provide a convenient
mechanism for explaining decisions. However, the
generation of comprehensible rules is an expensive and
time-consuming task, requiring a high degree of skill from
both the developers and the experts concerned. The
performance of the FMS is dependent upon the skill of the
human expert and how past data and events are interpreted.
Experts are often subjective and can only deal with a limited
number of transaction fields. While it was found that such
systems could be easily understood and provide an initial
level of success in automating fraud decision making, their
accuracy often worsened over time. To try to improve
accuracy, experts add more rules, but the system
then becomes increasingly complex, slower to process and
SOAR – Sparse Oracle-based Adaptive Rule Extraction: Knowledge
extraction from large-scale datasets to detect credit card fraud
Nick F Ryman-Tubb, Member, IEEE, Artur d'Avila Garcez
978-1-4244-8126-2/10/$26.00 ©2010 IEEE