Information Processing Letters 19 (1984) 61-65 31 August 1984
North-Holland
PARTIAL MATCH RETRIEVAL IN IMPLICIT DATA STRUCTURES
Helmut ALT
Department of Computer Science, The Pennsylvania State University, University Park, PA 16802, U.S.A.
Kurt MEHLHORN
Fachbereich Informatik, Universität des Saarlandes, 6600 Saarbrücken, Fed. Rep. Germany
J. Ian MUNRO
Data Structuring Group, Department of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
Communicated by J. Nievergelt
Received 10 December 1983
Revised 20 March 1984
Keywords: Data management, analysis of algorithms, combinatorial problems, data structures
1. Introduction
We consider a set of data which contains n
records, each of them consisting of k keys.
A partial match query is one in which an arbitrary
subset of the keys of a record is specified.
Problems of this type have been studied by a
number of researchers, including Bentley [3], Ghosh
and Abraham [6], and Rivest [9]. Unlike most, though
not all, previous work, our attention is restricted
to structures which are implicit, in that no
information other than the data itself (and the
number of records) is explicitly stored. That is, we
are interested in storage schemes which, for fast
retrieval, retain no pointers and represent the file
of n k-key records as a simple n by k array T. Any
structural information must be encoded in the
order of the records.
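To make the notion of a partial match query on an implicit n by k array concrete, the following is a minimal sketch of the naive baseline, a sequential scan. It is an illustration only, not a method from this paper: the names `matches` and `partial_match_scan`, the integer keys, and the use of `None` to mark an unspecified key are all assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple

# A record is a tuple of k keys; a query gives a value for each specified
# key and None for each unspecified one (an assumed convention).
Record = Tuple[int, ...]
Query = Sequence[Optional[int]]


def matches(record: Record, query: Query) -> bool:
    """True if the record agrees with the query on every specified key."""
    return all(q is None or q == r for r, q in zip(record, query))


def partial_match_scan(T: List[Record], query: Query) -> List[int]:
    """Return the row indices of T that match, by scanning all n rows."""
    return [i for i, record in enumerate(T) if matches(record, query)]


# The file is just an n by k array: no pointers, no auxiliary index.
T = [(3, 7), (1, 9), (3, 2), (5, 7)]
print(partial_match_scan(T, (3, None)))   # [0, 2]
print(partial_match_scan(T, (None, 7)))   # [0, 3]
```

The scan ignores the order of the rows entirely, so it always inspects all n records. An implicit structure aims to do better by choosing the row order carefully.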
The model of computation is the simple comparison
model, as it is assumed that the keys are
chosen from some arbitrarily large space. The
cost of a search is measured by the
maximum number of comparisons required when
any j of the k keys have been specified.
For example, in the case of a 2-key file, if the
file were sorted according to the first key, searches
could be performed quickly if key 1 were given.
However, one could do no better than a sequential
search if only the second key were specified. Clearly,
some 'balance' between the orderings on the keys is
required.
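The imbalance in the 2-key example can be seen by counting key comparisons. The sketch below assumes a file sorted on key 1 whose second keys are an unrelated permutation. The function names, the comparison counters, and the test data are hypothetical choices for illustration, not taken from the paper.

```python
from typing import List, Optional, Tuple

Record = Tuple[int, int]


def search_key1(T: List[Record], x: int) -> Tuple[Optional[int], int]:
    """Binary search on key 1; returns (row index or None, comparisons)."""
    lo, hi, comparisons = 0, len(T), 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1              # one comparison against key 1
        if T[mid][0] < x:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(T):
        comparisons += 1              # final equality check
        if T[lo][0] == x:
            return lo, comparisons
    return None, comparisons


def search_key2(T: List[Record], y: int) -> Tuple[Optional[int], int]:
    """Sequential scan on key 2; the sort order on key 1 gives no help."""
    comparisons = 0
    for i, record in enumerate(T):
        comparisons += 1              # one comparison against key 2
        if record[1] == y:
            return i, comparisons
    return None, comparisons


n = 1024
# Sorted on key 1; key 2 is a permutation of 0..n-1 (389 is coprime to n).
T = sorted((i, (i * 389 + 17) % n) for i in range(n))

idx1, c1 = search_key1(T, 700)
idx2, c2 = search_key2(T, -1)         # a second key that is not present
print(idx1, c1)                       # c1 grows like log2(n)
print(idx2, c2)                       # None 1024: every record is inspected
```

With key 1 specified, the number of comparisons grows logarithmically in n. With only key 2 specified, an unsuccessful search inspects all n records.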
2. An upper bound for partial match retrieval
The following procedure (cf. Munro [7]) orders
the records in a way such that the partial match
retrieval can be done efficiently. Initially, the only
segment is the entire file itself. Assume the keys
are numbered 0, 1, ..., k - 1.
procedure Order;
begin i := 0
  while there remain segments of more than 1 element do
  begin
    for each segment remaining from the previous stage do
    begin
0020-0190/84/$3.00 © 1984, Elsevier Science Publishers B.V. (North-Holland)