IJSRST1845464 | Received : 05 April 2018 | Accepted : 20 April 2018 | March-April-2018 [ (4) 5 : 1678-1684]
© 2018 IJSRST | Volume 4 | Issue 5 | Print ISSN: 2395-6011 | Online ISSN: 2395-602X
Themed Section: Science and Technology
Comparative Analysis of On-Shelf Utility Mining Algorithm
Dr. S. Vijayarani¹, Mrs. C. Sivamathi², Ms. V. Jeevika Tharini³
¹ Assistant Professor, Department of Computer Science, Bharathiar University, Coimbatore, Tamilnadu, India
² Ph.D. Research Scholar, Department of Computer Science, Bharathiar University, Coimbatore, Tamilnadu, India
³ PG Student, Department of Computer Science, Bharathiar University, Coimbatore, Tamilnadu, India
ABSTRACT
Data mining is the process of retrieving previously unknown and useful patterns from a database. Utility mining is
one of the important fields in data mining. It is the process of finding high utility itemsets in a
database. An item is termed a high utility item if its utility is more than a minimum threshold value.
The utility of an item is based on the user's interest or preference. Recently, temporal data mining has become a core
data processing technique for dealing with changing data. On-shelf utility mining takes into account the on-shelf time
period of each item and obtains the exact utility values of itemsets in a temporal database. In traditional on-shelf utility
mining, the profits of all items in the database are assumed to be positive. However, in real applications, some
items may have negative profit. In this work, the FOSHU (Faster On-Shelf High Utility) and TS-HOUN
(Three-Scan Algorithm for Mining On-shelf High Utility Itemsets with Negative profit) algorithms are
compared and their performance is measured.
Keywords: Utility mining, on-shelf utility mining, temporal database, relative utility, periodical utility.
I. INTRODUCTION
Data mining is the process of extracting interesting
information or patterns from large information
repositories. Its tasks include finding association rules,
classification rules and clustering rules. Among these
tasks, association rule mining is the most popular.
It has two phases. In the first phase, it
discovers all frequent itemsets based on a
user-defined minimum support threshold. In the second
phase, it generates association rules from the
discovered frequent itemsets based on a
user-defined minimum confidence threshold. Frequent
itemset mining considers only the frequency of an
item in a database. The relative importance of an item
inside a transaction, such as its price, weight or profit,
is not considered. However, in real-world business,
some items or itemsets with low support in the data
set may bring high profits due to their high price or
high frequency within transactions. Such useful,
profitable itemsets are missed by frequent itemset
mining [1].
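
The contrast described above can be illustrated with a minimal sketch. The toy transactions, item names, unit profits and function names below are hypothetical and are not taken from the paper or from any of the compared algorithms. The sketch computes the support of an itemset, the confidence of a rule, and the total profit (quantity times unit profit) that an itemset brings in.

```python
# Hypothetical toy data: each transaction maps an item to its purchased quantity.
transactions = [
    {"bread": 1, "milk": 2},
    {"bread": 2, "milk": 1},
    {"bread": 1},
    {"tv": 1},
]

# Hypothetical unit profit of each item (assumed values, for illustration only).
unit_profit = {"bread": 1, "milk": 2, "tv": 200}


def contains(transaction, itemset):
    """Return True if the transaction holds every item of the itemset."""
    return all(item in transaction for item in itemset)


def support(itemset, transactions):
    """Fraction of transactions that contain the whole itemset."""
    hits = sum(1 for t in transactions if contains(t, itemset))
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def total_profit(itemset, transactions, unit_profit):
    """Sum of quantity * unit profit over the transactions containing the itemset."""
    total = 0
    for t in transactions:
        if contains(t, itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


if __name__ == "__main__":
    for itemset in ({"bread"}, {"tv"}, {"bread", "milk"}):
        print(sorted(itemset),
              support(itemset, transactions),
              total_profit(itemset, transactions, unit_profit))
```

With these assumed values, {bread} has support 0.75 but a total profit of only 4, while {tv} has support 0.25 and a total profit of 200. A minimum support threshold of, say, 0.5 would keep {bread} and discard {tv}, which is the kind of profitable itemset that frequency-based mining misses.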
In weighted frequent itemset mining, the
weights of items, such as their unit profits in
the database, are considered. Even if items appear
infrequently, they may still be found if they have
high weights. But in this framework, the quantities of
items are ignored. Therefore it cannot satisfy the
requirements of users who are interested in finding
itemsets with consideration of both quantity and
profit [2]. Recently, utility itemset mining [3] has
been proposed to overcome the limitations of frequent
and weighted itemset mining. It considers the utility of an
item, which is based on interestingness measures such as the user's
preference or frequent patterns of interest. Utility
mining measures the importance of an item, and is thus
useful in real-world market data analysis. The utility of an
item in a database is the product of its external utility and its local
transaction utility. The local transaction utility
is the quantity of the item in a transaction, and the external
utility is the profit of the item. The
utility of an itemset is calculated from the product of
quantity and profit of its items. If the utility of an itemset is greater
than the threshold (predefined (user defined)