An Automatic Multi-Agent Web Image and
Associated Keywords Retrieval System
Nikolaos Papadakis
Computer and
Communication Engineering
Department
University of Thessaly
Volos, Greece
nkpap@telecom.ntua.gr
Klimis Ntalianis
Department of
Telecommunications Science
and Technology
University of Peloponnese
Tripoli, Greece
kntal@image.ntua.gr
Anastasios Doulamis
Decision Support Lab.
Technical University of Crete
Chania, Greece
adoulam@cs.ntua.gr
George Stamoulis
Computer and
Communication Engineering
Department
University of Thessaly
Volos, Greece
george@inf.uth.gr
Abstract— Web-based image search engines and content-based image
retrieval (CBIR) techniques are blind to the actual content of
images. As a result, a query for a specific object often returns
results cluttered with irrelevant data, leading to low precision.
Recall is also very low, since retrieval procedures are usually
based either on context (surrounding text) and file captions or on
low-level visual features. In this paper, an automatic multi-agent
image retrieval system is proposed. The system exploits the format
of multimedia sharing web sites to discover their underlying
structure, and then infers and extracts the multimedia files and
their associated keywords from the web pages. It first identifies
the section of the web page that contains the multimedia file to be
extracted and then extracts it using clustering techniques and
other statistical tools. Experimental results on real-world image
sharing web sites are presented and discussed, indicating the
promising performance of the proposed system.
Keywords- Multimedia retrieval; automatic wrapper;
multi-resolution visualization; web mining; multi-agent web data
extraction
I. INTRODUCTION
During the last decade, digital image collections have grown
rapidly in size. As computational power and bandwidth have
increased, the ability to store more complex data types on the
Web has improved significantly. These new media types
demand a different treatment during search and retrieval than
pure text. To this end, several Content Based Image
Retrieval (CBIR) methods have been proposed [1], some of
which are based on multiple agents [2], [3], [4]. However, most
methods rely on a combination of low-level features, which
usually cannot provide semantic information. On the other
hand, leading search engines such as Google and Yahoo retrieve
web images by checking captions, the HTML page content and
the surrounding text, information that may be irrelevant to the
content of an image. Thus, on the one hand, it is extremely
difficult to develop a generic method that works on every web
page and, on the other hand, visual features lack semantics.
To overcome these problems, some wrapper-based methods have
been proposed. For example, in [5] the user
has to perform a sample query on a component called provider
and then mark the important elements in the web pages, thus
guiding the generation process of the wrapper. The approach also
includes another component that handles a possible
re-arrangement of the elements or simply the addition of some
tags in the page. Another characteristic example is the
work in [6], which is based on two observations about data
records on the Web and on the use of a string matching algorithm.
The first is that a group of data records containing descriptions
of a set of similar objects is typically presented in a particular
region of a page and is formatted using similar HTML tags.
The HTML tags of a page are regarded as strings, so a string
matching algorithm is used to find similar HTML tags. The
second observation is that when a group of similar data records
is placed in a specific region, this is reflected in the tag tree:
the records lie under one parent node, which must be found.
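The two observations about [6] can be illustrated with a minimal sketch. It is not the algorithm of [6]: the class and function names (Node, TreeBuilder, find_record_region), the 0.8 similarity threshold and the sample page are assumptions made for illustration, and Python's difflib.SequenceMatcher is used as a stand-in for whatever string matching algorithm [6] employs. Each subtree is flattened into a string of tag names, and the parent node whose adjacent children have the most similar tag strings is reported as the data-record region.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser

# Tags that never have a closing tag (a partial list, enough for this sketch).
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class Node:
    """One element of the tag tree."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []

    def tag_string(self):
        """Flatten the subtree into a pre-order sequence of tag names."""
        seq = [self.tag]
        for child in self.children:
            seq.extend(child.tag_string())
        return seq


class TreeBuilder(HTMLParser):
    """Build a simple tag tree; text and attributes are ignored."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Climb to the nearest open element with this tag, if there is one.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent


def similarity(a, b):
    """Similarity in [0, 1] between the tag strings of two subtrees."""
    return SequenceMatcher(None, a.tag_string(), b.tag_string()).ratio()


def find_record_region(root, threshold=0.8):
    """Return the parent whose adjacent children are most often similar."""
    best, best_count = None, 0
    stack = [root]
    while stack:
        node = stack.pop()
        stack.extend(node.children)
        kids = node.children
        similar = sum(
            1 for x, y in zip(kids, kids[1:]) if similarity(x, y) >= threshold
        )
        if similar > best_count:
            best, best_count = node, similar
    return best


# Hypothetical page: a navigation bar and a list of three image records.
page = """
<html><body>
<div id="nav"><a href="#">Home</a><span>About</span></div>
<ul id="results">
  <li><img src="a.jpg"><p>sunset beach</p></li>
  <li><img src="b.jpg"><p>mountain snow</p></li>
  <li><img src="c.jpg"><p>city night</p></li>
</ul>
</body></html>
"""

builder = TreeBuilder()
builder.feed(page)
region = find_record_region(builder.root)
print(region.tag)  # prints: ul
```

In this sketch the three list items share the tag string (li, img, p), so their common parent is selected, whereas the navigation bar and the list itself have no tags in common.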
In order to avoid human guidance and raw-tag
manipulations, in this paper we propose a multi-agent system
that automatically segments web pages into structural tokens.
The proposed system is successfully applied to web image
sharing sites. In particular, images are commonly presented in
HTML pages that are mostly structured, but whose structure is
not known in advance. The most obvious problem in designing a system
for web image extraction is the lack of homogeneity in the
structure of the source data found in web image sharing sites.
In our case, managing this task is made somewhat easier by the
fact that web image sharing sites do have some structure of
their own: the image is presented in one part of a web page, while
its associated keywords are placed in another part.
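The kind of page layout described here can be made concrete with a small sketch. It is only an illustration of the layout, not the proposed system: the SectionScanner class, the choice of top-level div elements as sections, and the sample page are all assumptions. The sketch splits a page into sections and records which of them hold an image and which hold only text; deciding which text section actually carries the keywords is the harder problem and is not attempted.

```python
from html.parser import HTMLParser


class SectionScanner(HTMLParser):
    """Treat every top-level <div> as one section of the page (an assumption)."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # current <div> nesting depth
        self.sections = []  # one record per top-level <div>

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1
            if self.depth == 1:
                self.sections.append(
                    {"id": dict(attrs).get("id"), "images": [], "words": []}
                )
        elif tag == "img" and self.depth >= 1:
            self.sections[-1]["images"].append(dict(attrs).get("src"))

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Text outside any section (depth 0) is ignored.
        if self.depth >= 1:
            self.sections[-1]["words"].extend(data.split())


# Hypothetical image sharing page: the image and its words sit in different parts.
page = """
<body>
<div id="header"><a href="/">Home</a> <a href="/upload">Upload</a></div>
<div id="photo"><img src="sunset.jpg"></div>
<div id="tags"><a>sunset</a> <a>beach</a> <a>sea</a></div>
</body>
"""

scanner = SectionScanner()
scanner.feed(page)

# The section holding the image, and the text-only sections that are
# candidates for holding its associated keywords.
image_section = next(s for s in scanner.sections if s["images"])
text_sections = [s for s in scanner.sections if s["words"] and not s["images"]]

print(image_section["id"], image_section["images"])
for section in text_sections:
    print(section["id"], section["words"])
```

Note that both the header and the tag list are text-only sections, which is why a selection step is still needed after segmentation.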
This regularity is exploited in this paper to derive the
structure of the data. In particular, a novel fully automated
multi-agent scheme is presented that is able to segment a web
page into structural tokens and select the tokens of interest
(image and associated keywords). A key step towards
retrieving the data of interest is to discover the sections
contained in a web page and identify the ones holding the
interesting information. To do that, our method is based on a
978-1-4244-4530-1/09/$25.00 ©2009 IEEE