iDistance Techniques

H.V. Jagadish (1), Beng Chin Ooi (2), and Rui Zhang (3)
(1) Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA
(2) Department of Computer Science, National University of Singapore, Singapore, Singapore
(3) Department of Computer Science and Software Engineering, The University of Melbourne, Parkville, VIC, Australia

Synonyms

Query, Nearest Neighbor; Scan, Sequential

Definition

The iDistance is an indexing and query processing technique for k nearest neighbor (kNN) queries on point data in multi-dimensional metric spaces. The kNN query is one of the hardest problems on multi-dimensional data. It has been shown analytically and experimentally that, in high-dimensional spaces, any algorithm using a hierarchical index structure based on either space- or data-partitioning is less efficient than the naive method of sequentially checking every data record, called the sequential scan (Weber et al. 1998). Some data distributions, including the uniform distribution, are particularly hard cases (Beyer et al. 1999). The iDistance is designed to process kNN queries in high-dimensional spaces efficiently, and it is especially good for skewed data distributions, which usually occur in real-life data sets. For uniform data, the iDistance beats the sequential scan up to 30 dimensions, as reported in Jagadish et al. (2005).

Building the iDistance index has two steps. First, a number of reference points in the data space are chosen. There are various ways of choosing reference points; using cluster centers as reference points is the most efficient way. Second, the distance between a data point and its closest reference point is calculated. This distance plus a scaling value is called the point's iDistance.
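The two building steps can be illustrated with a minimal Python sketch. It assumes the commonly used form of the mapping, key = i * c + dist(p, O_i), where O_i is the closest reference point, i is its index, and c is a constant chosen larger than any point-to-reference distance so that the key ranges of different partitions do not overlap. The function name `idistance_key`, the constant `c`, and the sample points are hypothetical and serve only as an illustration.

```python
import math


def idistance_key(point, reference_points, c):
    """Return (partition index, one-dimensional key) for a data point.

    Assumption: key = i * c + dist(point, O_i), where O_i is the closest
    reference point and c exceeds every point-to-reference distance.
    """
    # Step 2 of the construction: distance to every reference point.
    dists = [math.dist(point, ref) for ref in reference_points]
    # The closest reference point decides which partition the point joins.
    i = min(range(len(dists)), key=dists.__getitem__)
    # The scaling value i * c separates the partitions on the key axis.
    return i, i * c + dists[i]


# Hypothetical reference points (for example, two cluster centers).
refs = [(0.0, 0.0), (10.0, 10.0)]
c = 100.0

points = [(3.0, 4.0), (10.0, 7.0), (1.0, 0.0)]
# Sorting by key gives the one-dimensional order used for indexing.
ordered = sorted(points, key=lambda p: idistance_key(p, refs, c)[1])
```

With these sample values, the three points receive keys of about 5, 103, and 1, so the two points nearest the first reference point sort ahead of the point assigned to the second one.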
By this means, points in a multi-dimensional space are mapped to one-dimensional values, and then a B+-tree can be adopted to index the points using the iDistance as the key. A kNN search is mapped to a number of one-dimensional range searches, which can be processed efficiently on a B+-tree. The iDistance technique can be viewed as a way of accelerating the sequential scan. Instead of scanning records from the beginning to the end

© Springer International Publishing AG 2017. S. Shekhar et al. (eds.), Encyclopedia of GIS, DOI 10.1007/978-3-319-17885-1