International Journal of Data Warehousing & Mining, 3(2), 1-11, April-June 2007
Copyright © 2007, Idea Group Inc. Copying or distributing in print or electronic forms without written permission of Idea Group Inc.
is prohibited.
Abstract
The 10th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) 2006
Data Mining Competition involved the problem of classifying mobile telecom network customers
into second generation (2G) and third generation (3G) services, with the ultimate aim of
identifying existing 2G network customers who had a high potential of switching to the mobile
operator’s new 3G mobile network and services. This paper discusses the background behind
the preparation of the dataset, the choice of judging criteria incorporating both a quantitative
measure of accuracy and a set of subjective qualitative assessments, and finally a summary of
the participation and results. We also highlight interesting observations and findings
from some of the participating teams.
Keywords: balanced accuracy rate; data mining competition; qualitative assessment; Telco
business problem
Introduction
The 10th PAKDD, a leading international
conference in the areas of data
mining and knowledge discovery, was held
in Singapore in May 2006. PAKDD, in
association with the Singapore Institute of
Statistics (SIS) and the Pattern Recognition
and Machine Intelligence Association of
Singapore (PREMIA), was able to organize
a data mining competition for the conference
in 2006 based on a telecom operator
business problem.¹
Although it was mentioned in the
conference proceedings (Ng, Kitsuregawa,
Li, & Chang, 2006) that it was the first
time that PAKDD held such a competition,
there is in fact a record of one other
competition, using medical data, organized
during the Fourth PAKDD conference in
Kyoto, Japan in 2000. In any case, the main
point is that PAKDD has not been
holding regular annual competitions like the
ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining
(which has been organizing an annual
KDD Cup since 1997²) or the European
Conference on Principles and Practice of
Knowledge Discovery in Databases (with
A Look Back at the PAKDD Data
Mining Competition 2006
Nathaniel B. Noriel, Singapore Institute of Statistics, Singapore
Chew Lim Tan, National University of Singapore, Singapore
This paper appears in the publication, International Journal of Data Warehousing and Mining, Volume 3, Issue 2
edited by David Taniar © 2007, IGI Global