Visual Media Retrieval Using Transform-Based
Layered Query Scheme
Esin Guldogan and Moncef Gabbouj
Institute of Signal Processing, Tampere University of
Technology, Tampere, Finland
E-mail: esin.guldogan@tut.fi, moncef.gabbouj@tut.fi
Olcay Guldogan
Nokia Technology Platforms,
Tampere, Finland
E-mail: olcay.guldogan@nokia.com
Abstract—This paper presents a visual media querying scheme
referred to as the Transform-Based Layered Query (TLQ) scheme.
The TLQ scheme mainly aims at decreasing retrieval processing
time and run-time memory consumption without degrading
retrieval results semantically. The scheme contains abstract
layers in indexing and retrieval phases, where each indexing
layer corresponds to a retrieval layer. The layers are constructed
based on transformations for reducing visual frame and feature
data dimensions. The proposed TLQ scheme also involves an
unsupervised method for eliminating irrelevant media items
between the retrieval layers. A two-layer TLQ system is
implemented and integrated into the MUVIS content-based
multimedia indexing and retrieval framework, and its theoretical
advantages are verified with dedicated experiments on image and
video databases. The experiments reveal that a 75% improvement
in retrieval performance, in terms of processing time, can be
achieved depending on the transformation parameters.
Keywords—content-based indexing and retrieval; retrieval
optimization; query system.
I. INTRODUCTION
Recent technology improvements, along with the growth of the
Internet, have led to a huge amount of digital multimedia content
in recent decades. Various methods, algorithms and systems
have been proposed addressing multimedia storage and
management problems. Such studies gave rise to the indexing
and retrieval concepts, which have further evolved into
Content-Based Multimedia Indexing and Retrieval (CBMIR) [1], [2],
[3]. Despite various successful systems, there is no perfect
global solution for CBMIR in general.
CBMIR systems often analyze multimedia content via so-called
low-level features for indexing and retrieval, such as
color, texture and shape. Recent systems intend to combine
low- and high-level features to achieve significantly higher
semantic performance. However, such combinations make
retrieval a more complex and time-consuming process.
Additionally, the processing time and memory requirements of
feature extraction are becoming more important problems.
Due to its high memory and processing power requirements,
CBMIR has not been widely used on resource-limited platforms,
such as mobile devices or distributed systems. Nevertheless, the
usage of CBMIR systems on these platforms is becoming
widespread. Hence, the performance optimization of indexing
and retrieval plays an important role in practical CBMIR
studies. Retrieval performance optimization is more visible for
the end-user of a CBMIR system, although indexing affects
retrieval directly. Query performance optimization during
retrieval consists of three main groups of problems:
• Processing time and computational complexity,
• Disk and run-time memory space requirements, and
• Semantic retrieval performance.
The Transform-Based Layered Query (TLQ) scheme is a new
visual multimedia querying scheme for increasing query
performance without degrading semantic performance. TLQ is
further described in Section II. A sample TLQ system
implementation integrated into the MUVIS [1] content-based
multimedia indexing and retrieval framework is presented in
Section III. The theoretical benefits of the implemented system
and its experimental results are also given in Section III.
Finally, Section IV presents the concluding remarks and
discussions.
II. TRANSFORM-BASED LAYERED QUERY (TLQ) SCHEME
A. TLQ System Structure
Transform-Based Layered Query (TLQ) is a querying
system for multimedia databases that are indexed in so-called
indexing/querying layers. It mainly aims at reducing retrieval
processing complexity, time and memory consumption. As
shown in the transformation scheme illustrated in Figure 1,
the layers are constructed based on three
transforms: T1, T2 and T3. T1 represents an optional
transformation working on visual media, whereas T3 represents
a similar optional transformation working only on video data.
T2 represents a compulsory transformation working on feature
data. Although the TLQ system does not directly depend on any
specific transformations, the underlying framework and
transformations should follow the assumptions and restrictions
below to achieve the overall system targets:
- The indexing process and feature extraction depend on
frame size in terms of time, memory usage and
complexity.
- The video indexing process also depends on video
key-frames in terms of time, memory usage and
complexity.
- The query process depends on feature data size in terms
of time, memory usage and complexity.
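The layered idea described above can be illustrated with a minimal Python sketch of a two-layer query. It is not the MUVIS implementation, and everything concrete in it is an assumption made for illustration. The T2 transform is modeled as block averaging of the feature vector (t2_reduce), similarity is Euclidean distance, the coarse layer is queried first, and a fixed keep_ratio stands in for the unsupervised elimination step, whose actual method is not specified in this section. The optional T1 and T3 transforms are omitted. All function and parameter names are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence


def t2_reduce(features: Sequence[float], factor: int) -> List[float]:
    """Hypothetical T2: shrink a feature vector by averaging blocks of `factor` values."""
    reduced = []
    for start in range(0, len(features), factor):
        block = features[start:start + factor]
        reduced.append(sum(block) / len(block))
    return reduced


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equally sized feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(database: Dict[str, Sequence[float]], factor: int) -> Dict[str, dict]:
    """Indexing phase: keep the full features and their T2-reduced version per item."""
    return {
        name: {"full": list(feat), "reduced": t2_reduce(feat, factor)}
        for name, feat in database.items()
    }


def layered_query(query: Sequence[float], index: Dict[str, dict],
                  factor: int, keep_ratio: float = 0.5) -> List[str]:
    """Two-layer retrieval: coarse ranking, elimination, then fine re-ranking."""
    # First retrieval layer: rank every item using the small, reduced features.
    q_reduced = t2_reduce(query, factor)
    coarse = sorted(index, key=lambda n: euclidean(q_reduced, index[n]["reduced"]))

    # Elimination between layers (assumed rule: keep a fixed fraction of the ranking).
    survivors = coarse[:max(1, int(len(coarse) * keep_ratio))]

    # Second retrieval layer: re-rank only the survivors using the full features.
    return sorted(survivors, key=lambda n: euclidean(query, index[n]["full"]))


if __name__ == "__main__":
    db = {"a": [1, 1, 1, 1], "b": [1, 1, 2, 2], "c": [9, 9, 9, 9], "d": [8, 8, 8, 8]}
    idx = build_index(db, factor=2)
    print(layered_query([1, 1, 2, 2], idx, factor=2))
```

Under these assumptions, the full-size feature comparison runs only on the items that survive the first layer, which is where a saving in query time and memory would come from if the query cost grows with feature data size, as the third assumption states.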
0-7803-9134-9/05/$20.00 ©2005 IEEE