International Journal of Applied Engineering Research
ISSN 0973-4562 Volume 10, Number 12 (2015) pp. 31269-31280
© Research India Publications
http://www.ripublication.com
Co-location Data Mining on Uncertain Datasets Using
a Probabilistic Approach
M. Sheshikala¹, D. Rajeswara Rao², and Md. Ali Kadampur³
¹SR Engineering College, Telangana, marthakala08@gmail.com
²KL University, Andhra Pradesh, rajeshduvvada@kluniversity.in
³SR Engineering College, Telangana, ali.kadampur@gmail.com
Abstract
Uncertain data sets commonly arise from real-world sources such as mobile data,
crime data, and GIS data. Handling such data is a challenge for knowledge
discovery, particularly in colocation mining. Finding probabilistic prevalent
colocations (PPCs) is one straightforward approach. This method tries to find
all colocations that can be generated from a random world. To do so, we first
apply an approximation error when finding the PPCs, which reduces the
computation. Next, we find all the possible worlds, split them into two
different worlds, and compute the prevalence probability. The prevalence
probability computed from these worlds is compared with a minimum probability
threshold to decide whether a colocation is a PPC or not. The experimental
results on the selected data set show a significant improvement in
computational time in comparison with some of the existing methods used in
colocation mining.
Index Terms— Probabilistic Approach, Colocation Mining, Uncertain Data Sets
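To make the possible-world idea in the abstract concrete, the following is a minimal brute-force sketch for a single size-2 colocation {A, B}. It is not the approximation method this paper proposes. It assumes the participation-index definition commonly used in colocation mining (the minimum, over features, of the fraction of instances that have a neighbor of the other feature), which the abstract does not state. The toy data, the distance threshold, and the names `participation_index`, `prevalence_probability`, and `is_ppc` are all hypothetical.

```python
from itertools import product
from math import dist

# Hypothetical toy data: (feature, x, y, existence probability).
INSTANCES = [
    ("A", 0.0, 0.0, 0.9),
    ("A", 5.0, 5.0, 0.5),
    ("B", 0.5, 0.0, 0.8),
    ("B", 9.0, 9.0, 0.6),
]


def participation_index(world, max_dist):
    """Assumed definition: min over A and B of the fraction of that
    feature's instances with a neighbor of the other feature."""
    ratios = []
    for feat, other in (("A", "B"), ("B", "A")):
        own = [p for p in world if p[0] == feat]
        if not own:
            return 0.0  # a feature with no instances cannot participate
        others = [p for p in world if p[0] == other]
        hits = sum(
            any(dist(p[1:3], q[1:3]) <= max_dist for q in others)
            for p in own
        )
        ratios.append(hits / len(own))
    return min(ratios)


def prevalence_probability(instances, max_dist, min_prev):
    """Sum the probabilities of all possible worlds in which {A, B}
    is prevalent. Exponential in the number of instances."""
    total = 0.0
    for mask in product([False, True], repeat=len(instances)):
        prob = 1.0
        world = []
        for present, inst in zip(mask, instances):
            prob *= inst[3] if present else 1.0 - inst[3]
            if present:
                world.append(inst)
        if participation_index(world, max_dist) >= min_prev:
            total += prob
    return total


def is_ppc(instances, max_dist, min_prev, min_prob):
    """Compare the prevalence probability with the minimum
    probability threshold."""
    return prevalence_probability(instances, max_dist, min_prev) >= min_prob
```

On this toy data, only the first A and the first B instance are neighbors at distance threshold 1.0, so with a prevalence threshold of 0.5 the prevalence probability is about 0.9 × 0.8 = 0.72. Enumerating every world costs 2^n evaluations for n instances, which is the kind of cost that motivates reducing the computation.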
I. INTRODUCTION
Basically, colocation mining is a sub-domain of data mining. Research in
colocation mining has advanced in the recent past, addressing issues of
applications, utility, and methods of knowledge discovery. Many techniques inspired
by database methods (join-based, join-less, space partitioning, etc.) have been
attempted to find prevalent colocation patterns in spatial data. Fusion and
fuzzy-based methods have also been in use. However, due to the growing size of the
data and computational time requirements, highly scalable and computationally time-efficient