I
Identity Aware LBS
⊲ Privacy Threats in Location-Based Services
Identity Unaware LBS
⊲ Privacy Threats in Location-Based Services
iDistance Techniques
H.V. Jagadish¹, Beng Chin Ooi², and Rui Zhang³
¹ Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA
² Department of Computer Science, National University of Singapore, Singapore, Singapore
³ Department of Computer Science and Software Engineering, The University of Melbourne, Parkville, VIC, Australia
Synonyms
Query, Nearest Neighbor; Scan, Sequential
Definition
The iDistance is an indexing and query processing technique for k nearest neighbor (kNN) queries on point data in multi-dimensional metric spaces. The kNN query is one of the hardest problems on multi-dimensional data. It has been shown analytically and experimentally that, in high-dimensional spaces, any algorithm using a hierarchical index structure based on either space- or data-partitioning is less efficient than the naive method of sequentially checking every data record, called the sequential scan (Weber et al. 1998). Some data distributions, including the uniform distribution, are particularly hard cases (Beyer et al. 1999). The iDistance is designed to process kNN queries in high-dimensional spaces efficiently, and it is especially good for skewed data distributions, which usually occur in real-life data sets. For uniform data, the iDistance beats the sequential scan up to 30 dimensions, as reported in Jagadish et al. (2005). Building the
iDistance index has two steps. First, a number of reference points in the data space are chosen. There are various ways of choosing reference points. Using cluster centers as reference points is the most efficient way. Second, the distance between a data point and its closest reference point is calculated. This distance plus a scaling value is called the point's iDistance. By this means, points in a multi-dimensional space are mapped to one-dimensional values, and then a B+-tree can be adopted to index the points using the iDistance as the key. A kNN search is mapped to a number of one-dimensional range searches, which can be processed efficiently on a B+-tree.
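The mapping and search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name IDistanceSketch and the parameters c, r0, and growth are hypothetical; the scaling value is assumed to take the form i * c for the i-th reference point, with c larger than any point-to-reference distance; a sorted list searched with bisect stands in for the B+-tree; and the search radius is simply doubled until k neighbors are confirmed. Reference points are supplied by the caller (for example, cluster centers).

```python
import bisect
import math


class IDistanceSketch:
    """Toy iDistance index; a sorted key list stands in for the B+-tree."""

    def __init__(self, points, refs, c):
        # refs: reference points chosen by the caller (assumption).
        # c: assumed scaling constant separating the key ranges of partitions.
        self.refs = refs
        self.c = c
        self.max_r = [0.0] * len(refs)  # largest distance seen in each partition
        entries = []
        for p in points:
            # Assign p to its closest reference point.
            i = min(range(len(refs)), key=lambda j: math.dist(p, refs[j]))
            d = math.dist(p, refs[i])
            self.max_r[i] = max(self.max_r[i], d)
            # One-dimensional key: distance plus a per-partition scaling value.
            entries.append((i * c + d, p))
        if c <= max(self.max_r):
            raise ValueError("c must exceed every point-to-reference distance")
        entries.sort()
        self.keys = [key for key, _ in entries]
        self.points = [p for _, p in entries]

    def range_candidates(self, q, r):
        """Collect points whose keys fall in the 1-D ranges induced by (q, r)."""
        out = []
        for i, o in enumerate(self.refs):
            d = math.dist(q, o)
            if d - r > self.max_r[i]:
                continue  # the query sphere cannot reach this partition
            # By the triangle inequality, any point within r of q has a
            # distance to the reference point in [d - r, d + r].
            lo = i * self.c + max(0.0, d - r)
            hi = i * self.c + min(self.max_r[i], d + r)
            a = bisect.bisect_left(self.keys, lo)
            b = bisect.bisect_right(self.keys, hi)
            out.extend(self.points[a:b])
        return out

    def knn(self, q, k, r0=0.1, growth=2.0):
        """Enlarge the search radius until k neighbors lie within it."""
        # Beyond this radius every indexed point is a candidate.
        limit = max(math.dist(q, o) + m for o, m in zip(self.refs, self.max_r))
        r = r0
        while True:
            cands = sorted(self.range_candidates(q, r),
                           key=lambda p: math.dist(p, q))
            confirmed = sum(1 for p in cands if math.dist(p, q) <= r)
            if confirmed >= k or r >= limit:
                return cands[:k]
            r *= growth
```

A small usage example with two hand-picked reference points:

```python
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
index = IDistanceSketch(pts, refs=[(0.0, 0.0), (5.0, 5.0)], c=100.0)
print(index.knn((0.9, 0.1), k=2))  # [(1.0, 0.0), (0.0, 0.0)]
```

Each round of the loop issues at most one key-range lookup per reference point, which is the sense in which a kNN search reduces to a number of one-dimensional range searches. The candidates returned by those lookups are a superset of the true answers, so they are still ranked by their actual distance to the query.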
The iDistance technique can be viewed as a way
of accelerating the sequential scan. Instead of
scanning records from the beginning to the end
© Springer International Publishing AG 2017
S. Shekhar et al. (eds.), Encyclopedia of GIS,
DOI 10.1007/978-3-319-17885-1