2736 | International Journal of Current Engineering and Technology, Vol.4, No.4 (Aug 2014)
Review Article
International Journal of Current Engineering and Technology
E-ISSN 2277 – 4106, P-ISSN 2347 - 5161
©2014 INPRESSCO®, All Rights Reserved
Available at http://inpressco.com/category/ijcet
A Summarized Review on Web Usage Mining
Tayyaba Ashraf* and Imran Ashraf
IT Department, University of Gujrat, Gujrat, Pakistan
Accepted 10 August 2014, Available online 25 Aug 2014, Vol.4, No.4 (Aug 2014)
Abstract
Incremental growth in the use of web-based services and systems has led to the generation of tremendous amounts of
data. This data plays a vital role in determining factors such as users' interests, priorities,
and product or service usage trends. This knowledge enables organizations to evaluate the effectiveness of their
strategies and the quality of the services and products they provide, and guides further refinement. The application of data mining
approaches to the huge volume of data available on the internet is termed web mining. Web usage mining is a further phase of
web mining that discovers data about how the internet is used. This paper reviews the phases and
techniques involved in web usage mining.
Keywords: Web-mining, Preprocessing, Sequential Patterns, Proxy Level Patterns.
1. Introduction
In this rapidly growing age of information technology, data
has gained crucial importance for every organization. It is
the most valuable asset for organizations in this era. Due
to the rapid emergence of electronic data management
methods, this age is called the Information Age (Goebel and
Gruenwald, 1999). Each organization holds a huge volume of
data, and it is very difficult and often impossible to handle
that data without a computer-based application. In
addition to data management, analysis of such a large
collection of data is also a major problem. Today's
databases contain so much data that manual
analysis and sound decision making are not possible. In
many cases a large number of independent fields need to be analyzed
at the same time to get accurate results (Goebel and
Gruenwald, 1999). Humans therefore require support to
improve their analytical ability. The need for automated
extraction of relevant data from a huge volume of data is
now widely recognized, and it has motivated the search for more efficient
techniques for this purpose.
This review paper aims to collect and analyze the
major approaches that have appeared for the
extraction of web data, and provides an overview of the
most prominent mining phases and
the most recent trends in them. The paper is divided into five
sections. After this introduction, Section 2 gives an overview of
related work and briefly discusses data mining in general.
Section 3 is about web data, its usage and its structure.
Section 4 highlights data foundations for the web. Section 5
discusses the approaches used in web usage mining. A
conclusion is given at the end.
*Corresponding author: Tayyaba Ashraf; Imran Ashraf is working as
Lecturer in CS & IT Department
2. Terminologies and Background
This is the era of computers (Khushboo, Vekariya and Mishra,
2012) and electronic information (Han and Kamber, 2006);
every sphere of life depends on accurate and timely
data. As a result, huge collections of data are
produced in fields such as science, medicine, marketing and
finance (Anwer, Rashid and Hassan, 2010). Automated
systems are required for systematic summarization,
exploration, and classification of the available data. They
help management take timely and relevant
decisions. Many research areas, such as mathematics,
artificial intelligence and meditation, are involved in
developing such automated systems (Gibson, Kleinberg and
Raghavan, 1998; Pei et al., 2000; Kohavi and Provost, 2001;
Anwer, Rashid and Hassan, 2010; D.S.Deshpande, 2012;
Seerat and Azam, 2012; Shelke, Deshpande and Thakre, 2012).
Fig.1 A hierarchical view of web usage mining (node labels: Data mining, Text Mining, Web Mining, Web usage Mining)
A large number of applications have been presented to store and
extract data from huge collections. Such computer-based
tools and methods are the topic of discussion in
knowledge extraction in databases, and text mining is an
interdisciplinary field which is used in different areas like