Vol 02, Issue 01, June 2013
International Journal of Data Mining Techniques and Applications
http://iirpublications.com
ISSN: 2278-2419
Integrated Intelligent Research (IIR) 141

Web Usage Mining Using Association Rule Mining on Clustered Data for Pattern Discovery

Shaily G. Langhnoja #1, Mehul P. Barot *2, Darshak B. Mehta #3

# Computer Department, M.E. (Pursuing), Gujarat Technical University, Gujarat, India
# I.T. Department, Lecturer, Govt. Polytechnic College, Gandhinagar, Gujarat, India
1 shaily09@gmail.com
3 frienddarshak@gmail.com
* Associate Professor, L.D.R.P. Institute of Technology & Research, Gandhinagar, Gujarat, India
2 m_p_barot@yahoo.com

Abstract: Web usage mining is the application of data mining techniques to discover interesting usage patterns from Web data, in order to understand and better serve the needs of Web-based applications. Analyzing such data can support effective Web site management, the creation of adaptive Web sites, business and support services, personalization, and network traffic flow analysis. Much research has been done in this field; this paper focuses on finding user access patterns on a website from web log records, based on users’ sessions and behaviour. Web usage mining comprises three phases: pre-processing, pattern discovery, and pattern analysis. In this paper, clustering and association rule mining are combined for pattern discovery. This approach helps in finding effective usage patterns.

Keywords: Web Mining, Web Usage Mining, Clustering, Association Rule Mining.

I. INTRODUCTION

Web mining is the application of data mining techniques to extract knowledge from Web data, in which at least one of structure or usage (Web log) data is used in the mining process. Researchers have identified three broad categories of Web mining.

A. Web content mining

Web content mining is the process of discovering useful information from text, image, audio, or video data on the web. It is sometimes called web text mining, because text content is the most widely researched area. The technologies normally used in web content mining are NLP (natural language processing) and IR (information retrieval).

B. Web structure mining

Web structure mining operates on the Web’s hyperlink structure. It is the process of using graph theory to analyze the node and connection structure of a web site. This graph structure can provide information about a page’s ranking or authoritativeness and can enhance search results through filtering.

According to the type of web structural data, web structure mining can be divided into two kinds. The first kind extracts patterns from hyperlinks in the web. A hyperlink is a structural component that connects a web page to a different location. The second kind mines the document structure. It uses the tree-like structure to analyze and describe the HTML (Hyper Text Markup Language) or XML (eXtensible Markup Language) tags within the web page.

C. Web usage mining

Web usage mining, also known as web log mining, aims to discover interesting and frequent user access patterns from web browsing data stored in web server logs, proxy server logs, or browser logs. It applies data mining to analyze users’ usage data on the web and discover interesting patterns in it. The usage data records the user’s behavior when the user browses or makes transactions on the web site.
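
To make the notion of usage data concrete, the following Python sketch shows one way web server log records might be turned into per-user sessions. It is an illustration only, not the method used in this paper: it assumes that the log is in Common Log Format, that a user can be identified by the host field, and that a gap of more than 30 minutes starts a new session. The names LOG_PATTERN, SESSION_TIMEOUT, parse_line, build_sessions and the sample records are hypothetical.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: records follow the Common Log Format; the paper does not
# specify the log format, so this pattern is illustrative only.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

# Assumption: a gap longer than 30 minutes starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)

# Made-up sample records, for demonstration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Jun/2013:10:00:00 +0530] "GET /index.html HTTP/1.1" 200 1043',
    '10.0.0.1 - - [01/Jun/2013:10:05:00 +0530] "GET /courses.html HTTP/1.1" 200 2311',
    '10.0.0.2 - - [01/Jun/2013:10:06:00 +0530] "GET /index.html HTTP/1.1" 200 1043',
    '10.0.0.1 - - [01/Jun/2013:11:30:00 +0530] "GET /contact.html HTTP/1.1" 200 512',
]


def parse_line(line):
    """Return (host, timestamp, url) for a log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return match.group("host"), timestamp, match.group("url")


def build_sessions(lines):
    """Group requests by host and split each host's requests at long time gaps."""
    by_host = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        if record is not None:
            host, timestamp, url = record
            by_host[host].append((timestamp, url))

    sessions = []
    for host, visits in by_host.items():
        visits.sort(key=lambda visit: visit[0])  # order requests by time
        current = [visits[0][1]]
        for (prev_time, _), (time, url) in zip(visits, visits[1:]):
            if time - prev_time > SESSION_TIMEOUT:
                sessions.append((host, current))  # close the previous session
                current = []
            current.append(url)
        sessions.append((host, current))
    return sessions


for host, pages in build_sessions(SAMPLE_LOG):
    print(host, pages)
```

Each resulting session is a list of pages visited by one host within one visit. Such lists are the kind of input on which clustering and association rule mining could then operate. Identifying users by host alone is a simplification, since several users may share one address.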