Processing Probabilistic Range Queries over Gaussian-based Uncertain Data

Tingting Dong 1, Chuan Xiao 1, Xi Guo 2, and Yoshiharu Ishikawa 1

1 Nagoya University, Japan
{dongtt,chuanx,y-ishikawa}@nagoya-u.jp
2 The Chinese University of Hong Kong, China
guoxi@se.cuhk.edu.hk

Abstract. The probabilistic range query is an important type of query in the area of uncertain data management. A probabilistic range query returns all the objects within a specific range from the query object with a probability no less than a given threshold. In this paper we assume that each uncertain object stored in the database is associated with a multi-dimensional Gaussian distribution, which describes the probability that the object appears at each position in the multi-dimensional space. A query object is either a certain object or an uncertain object modeled by a Gaussian distribution. We propose several filtering techniques and an R-tree-based index to efficiently support probabilistic range queries over Gaussian objects. Extensive experiments on real data demonstrate the efficiency of our proposed approach.

1 Introduction

In recent years, uncertain data management has received considerable attention in the database community. It involves a large variety of real-world applications, ranging from mobile robotics and sensor networks to location-based services.

Among all the problems in the area of uncertain data management, the probabilistic range query is an important one for processing uncertain data in real-world applications. A probabilistic range query returns all the data objects that appear within the given search region with probabilities no less than a given probability threshold.

For instance, consider a self-navigated mobile robot moving in a wireless environment. The robot builds a map of the environment by observing nearby landmarks through devices such as sonar and laser range finders.
Due to the inherent limitations of sensor accuracy and signal noise, the location information acquired from measuring devices is not always precise. At the same time, the robot also conducts probabilistic localization [19] to estimate its own location autonomously by integrating its movement history and the landmark information. This makes the location of the robot imprecise as well. Consequently, probabilistic queries have evolved to tackle such imprecision; e.g., “find landmarks lying within 5 meters from my current location with a probability of at least 80%”.
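
To make the query semantics concrete, the following is a minimal brute-force sketch of a probabilistic range query for the simpler case of a certain (exact) query point. It is not the filtering or indexing approach proposed in this paper: it estimates each object's probability of lying within the range by Monte Carlo sampling, under the assumption of 2-D Gaussian objects. The function names, the sample count, and the landmark data are all hypothetical and chosen only for illustration.

```python
import math
import random


def sample_gaussian_2d(mean, cov, rng):
    """Draw one sample from a 2-D Gaussian using a hand-written Cholesky factor."""
    (sxx, sxy), (_, syy) = cov
    a = math.sqrt(sxx)
    b = sxy / a
    c = math.sqrt(syy - b * b)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return (mean[0] + a * z1, mean[1] + b * z1 + c * z2)


def appearance_probability(mean, cov, query, delta, n_samples=20000, seed=0):
    """Monte Carlo estimate of Pr(||o - q|| <= delta) for a Gaussian object o."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x, y = sample_gaussian_2d(mean, cov, rng)
        if math.hypot(x - query[0], y - query[1]) <= delta:
            hits += 1
    return hits / n_samples


def probabilistic_range_query(objects, query, delta, theta):
    """Return ids of objects within delta of query with probability >= theta."""
    return [
        oid
        for oid, (mean, cov) in objects.items()
        if appearance_probability(mean, cov, query, delta) >= theta
    ]


# Hypothetical landmarks: id -> (mean, covariance matrix).
landmarks = {
    "near": ((1.0, 1.0), ((0.5, 0.0), (0.0, 0.5))),
    "border": ((5.0, 0.0), ((1.0, 0.0), (0.0, 1.0))),
    "far": ((20.0, 20.0), ((1.0, 0.2), (0.2, 1.0))),
}

# "Find landmarks lying within 5 meters of (0, 0) with probability at least 80%."
result = probabilistic_range_query(landmarks, (0.0, 0.0), 5.0, 0.8)
print(result)
```

Sampling every object in this way costs time linear in the number of objects and samples, which is the kind of per-object evaluation that filtering techniques and an index aim to avoid.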