AN IMPROVED CLUSTERING ALGORITHM FOR CUSTOMER SEGMENTATION

PRABHA DHANDAYUDAM
Department of Computer Science and Engineering, Sri Krishna College of Engineering and Technology, Coimbatore, Tamilnadu, India
prabhadhandayudam@gmail.com

Dr. ILANGO KRISHNAMURTHI
Department of Computer Science and Engineering, Sri Krishna College of Engineering and Technology, Coimbatore, Tamilnadu, India
ilango.krishnamurthi@gmail.com

Abstract

Customer segmentation is the process of grouping customers based on their purchase habits. Data mining is useful for finding knowledge in huge amounts of data. The clustering techniques of data mining can be applied to customer segmentation: based on their transaction details, customers are clustered so that those in one group behave more similarly to each other than to the customers in other groups. Recency (R), Frequency (F) and Monetary (M) are the important attributes that determine the purchase behavior of a customer. In this paper, we present an improved clustering algorithm for segmenting customers using RFM values and compare its performance against traditional techniques such as K-means, single link and complete link.

Keywords: Customer Segmentation; Clustering; Customer Relationship Management; RFM method.

1. Introduction

Customer Relationship Management (CRM) technology is a mediator between customer management activities in all stages of a relationship (initiation, maintenance and termination) and business performance [Markus Wubben (2008)]. Customer segmentation gives a quantifiable way to analyze customer data and distinguish customers based on their purchase behavior [Jing Wu and Zheng Lin (2005)]. In this way, customers can be grouped into different categories to which marketing staff can apply targeted marketing and thus retain the customers. Once the customers are segmented, rules can be generated to describe the customers in each group based on their purchase behavior.
These rules can be used to assign new customers to the group whose members have similar purchase characteristics. The RFM method is very effective for customer segmentation [Jing Wu and Zheng Lin (2005)]. R means recency, which indicates the time interval between the present and the date of the customer's previous transaction. F means frequency, which indicates the number of transactions that the customer has made in a particular interval of time. M means monetary, which indicates the total value of the customer's transaction amounts. It has been proven that the values of R, F and M decide the characteristics of customer behavior [Newell (1997)].

Data mining is the process of extracting useful information from huge volumes of data. It finds a useful application in CRM, where large amounts of customer data are handled [Ngai et al. (2009)]. A clustering technique in data mining produces clusters for the given input data such that the data within one cluster are more similar to each other than to the data in other clusters [Han and Kamber (2001)]. The similarity is measured in terms of the distance between the data. Distance can be calculated using the Manhattan distance, which is given by

$$ d(x, y) = \sum_{i=1}^{n} \lvert (x_i - y_i) \rvert \tag{1} $$

Prabha Dhandayudam et al. / International Journal of Engineering Science and Technology (IJEST) ISSN : 0975-5462 Vol. 4 No.02 February 2012 695
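As an illustration of the RFM attributes and the Manhattan distance of Eq. (1), the following minimal Python sketch derives an (R, F, M) vector from one customer's transactions and measures the distance between two such vectors. The function names `rfm` and `manhattan`, the representation of a transaction as a (date, amount) pair, and the choice of days as the unit of recency are assumptions made for this example; they are not taken from the paper. The raw values are also left unscaled here, which is an assumption of the sketch.

```python
from datetime import date


def rfm(transactions, today):
    """Return (recency, frequency, monetary) for one customer.

    `transactions` is a non-empty list of (date, amount) pairs; this input
    format is hypothetical and chosen only for the example.
    """
    last_purchase = max(d for d, _ in transactions)
    recency = (today - last_purchase).days       # days since the latest transaction (assumed unit)
    frequency = len(transactions)                # number of transactions in the period
    monetary = sum(amount for _, amount in transactions)  # total transaction value
    return (recency, frequency, monetary)


def manhattan(x, y):
    """Manhattan distance of Eq. (1): sum of |x_i - y_i| over all n attributes."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of attributes")
    return sum(abs(a - b) for a, b in zip(x, y))


if __name__ == "__main__":
    today = date(2012, 2, 1)
    customer_a = rfm([(date(2012, 1, 5), 100.0), (date(2012, 1, 20), 50.0)], today)
    customer_b = rfm([(date(2012, 1, 28), 30.0)], today)
    print(customer_a, customer_b, manhattan(customer_a, customer_b))
```

Because R, F and M are measured in different units, the attribute with the largest numeric range tends to dominate an unscaled distance, so a practical use of this sketch would likely bring the three attributes onto a comparable scale first.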