International Journal of Applied Engineering Research ISSN 0973-4562 Volume 10, Number 12 (2015) pp. 31269-31280 © Research India Publications http://www.ripublication.com

Co-location Data Mining on Uncertain Datasets Using a Probabilistic Approach

M. Sheshikala 1, D. Rajeswara Rao 2, and Md. Ali Kadampur 3
1 SR Engineering College, Telangana, marthakala08@gmail.com
2 KL University, Andhra Pradesh, rajeshduvvada@kluniversity.in
3 SR Engineering College, Telangana, ali.kadampur@gmail.com

Abstract
Uncertain data sets generally contain real-world data such as mobile data, crime data, and GIS data. Handling such data is a challenge for knowledge discovery, particularly in colocation mining. Finding probabilistic prevalent colocations (PPCs) is one straightforward approach. This method tries to find all colocations that are to be generated from a random world. For this, we first apply an approximation error to find all the PPCs, which reduces the computations. Next, we find all the possible worlds, split them into two different worlds, and compute the prevalence probability. The prevalence probability computed over these worlds is then compared with a minimum probability threshold to decide whether a colocation is a PPC or not. The experimental results on the selected data set show a significant improvement in computational time in comparison to some of the existing methods used in colocation mining.

Index Terms: Probabilistic Approach, Colocation Mining, Uncertain Data Sets

I. INTRODUCTION
Basically, colocation mining is a sub-domain of data mining. Research in colocation mining has advanced in the recent past, addressing issues with applications, utility, and methods of knowledge discovery. Many techniques inspired by database methods (join-based, join-less, space partitioning, etc.) have been attempted to find the prevalent colocation patterns in spatial data. Fusion and fuzzy-based methods have also been in use.
However, due to the growing size of the data and computational time requirements, highly scalable and computationally time-efficient