Intermediary-based transcoding framework
by S. C. Ihde, P. P. Maglio, J. Meyer, and R. Barrett
With the rapid increase in the amount of content
on the World Wide Web, it is now becoming clear
that information cannot always be stored in a
form that anticipates all of its possible uses. One
solution to this problem is to create transcoding
intermediaries that convert data, on demand,
from one form into another. Up to now, these
transcoders have usually been stand-alone
components, converting one particular data
format to another particular data format. A
more flexible approach is to create modular
transcoding units that can be composed as
needed. In this paper, we describe the benefits
of an intermediary-based transcoding approach
and present a formal framework for document
transcoding that is meant to simplify the problem
of composing transcoding operations.
Today, content providers on the World Wide Web
(WWW) are under constant pressure to make in-
formation available in a variety of formats and for
a variety of purposes. For example, the Yahoo!**
catalog server provides information formatted in Hy-
perText Markup Language (HTML) for standard Web
browsers, and also provides some of this informa-
tion formatted for handheld devices such as Palm
Pilots** and wireless phones. In this case, content
is formatted differently for displays that have differ-
ent capabilities, and is also delivered differently for
devices that have different connectivity. Concern for
network bandwidth limitations in particular has
spurred many projects aimed at minimizing the
amount of data transmitted for Web transactions.
For instance, Fox and colleagues [1–3] developed a
proxy-based architecture for distilling or transform-
ing content so that thin devices receive only the data
they can handle (e.g., devices with monochrome dis-
plays do not receive color images), thus minimizing
the network bandwidth needed to transmit informa-
tion.
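
The idea of tailoring content to a thin device can be illustrated with a minimal Python sketch. It is not the architecture of Fox and colleagues; the names DeviceProfile, Image, and distill are hypothetical, and the size estimate assumes uncompressed pixels at 3 bytes each for color and 1 byte each for grayscale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceProfile:
    """Hypothetical description of what a client device can display."""
    color: bool
    max_width: int


@dataclass(frozen=True)
class Image:
    """Image metadata only; pixel data is omitted to keep the sketch small."""
    width: int
    height: int
    color: bool

    def size_in_bytes(self) -> int:
        # Assumed cost model: uncompressed, 3 bytes/pixel color, 1 byte/pixel gray.
        return self.width * self.height * (3 if self.color else 1)


def distill(image: Image, device: DeviceProfile) -> Image:
    """Return a version of the image that the device can actually use."""
    # Color survives only if the image has it and the device can show it.
    color = image.color and device.color
    width, height = image.width, image.height
    if width > device.max_width:
        # Scale down, preserving the aspect ratio with integer arithmetic.
        height = max(1, height * device.max_width // width)
        width = device.max_width
    return Image(width, height, color)


original = Image(width=640, height=480, color=True)
handheld = DeviceProfile(color=False, max_width=160)
reduced = distill(original, handheld)
print(reduced, reduced.size_in_bytes(), "bytes instead of", original.size_in_bytes())
```

Under this cost model the monochrome handheld receives a far smaller image than the original, which is the bandwidth saving the proxy is meant to achieve.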
Moreover, as more and more companies explore the
WWW as a place to do business, large amounts of in-
formation of various types and in a variety of for-
mats will be made available on the Web. This leads
to the problem of converting data to enable appli-
cations to handle data that might come from a va-
riety of sources. In this context, bandwidth limita-
tions or client resources (e.g., CPU power and disk
space) are not a major concern. The main questions
here are: What form is the information in, and what form
does it need to be in? The ability to convert content
from one form to another lets systems that use dif-
ferent languages and conventions communicate and
interoperate. The Extensible Markup Language (XML) [4]
is particularly well suited to businesses that need
to convert data from one form to another, because it
provides a means of specifying semantic structure.
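
The two questions above (what form is the data in, and what form does it need to be in) suggest treating each converter as a step from one named format to another and chaining steps when no direct converter exists. The following Python sketch illustrates that idea only; it is not the framework presented in this paper. The format names "csv", "fields", and "xml", the TRANSCODERS registry, and the helper functions are all hypothetical.

```python
from collections import deque
from xml.sax.saxutils import escape


def csv_to_fields(text):
    """Split one comma-separated line into a list of field strings."""
    return text.strip().split(",")


def fields_to_xml(fields):
    """Wrap each field in an element, escaping XML special characters."""
    return "<row>" + "".join(f"<field>{escape(f)}</field>" for f in fields) + "</row>"


# Hypothetical registry: (input format, output format) -> conversion function.
TRANSCODERS = {
    ("csv", "fields"): csv_to_fields,
    ("fields", "xml"): fields_to_xml,
}


def find_chain(transcoders, source, target):
    """Breadth-first search for the shortest list of steps from source to target."""
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        fmt, chain = queue.popleft()
        if fmt == target:
            return chain
        for (in_fmt, out_fmt), step in transcoders.items():
            if in_fmt == fmt and out_fmt not in seen:
                seen.add(out_fmt)
                queue.append((out_fmt, chain + [step]))
    return None  # no sequence of registered transcoders reaches the target


def transcode(transcoders, data, source, target):
    """Apply the chain of transcoders that converts data from source to target."""
    chain = find_chain(transcoders, source, target)
    if chain is None:
        raise ValueError(f"no conversion path from {source!r} to {target!r}")
    for step in chain:
        data = step(data)
    return data


print(transcode(TRANSCODERS, "widget,4 < 5", "csv", "xml"))
```

No direct "csv" to "xml" converter is registered here, so the request is satisfied by composing the two smaller steps.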
The process of converting, distilling, or transforming
content is often referred to as transcoding [1,5] (see
Figure 1). The term is also used when
referring to algorithms for transforming certain data,
such as images and movies, from one format into an-
other. For example, video transcoding is the process
of converting between different compression formats,
or of further reducing the bit rate of a previously
Copyright 2001 by International Business Machines Corpora-
tion. Copying in printed form for private use is permitted with-
out payment of royalty provided that (1) each reproduction is done
without alteration and (2) the Journal reference and IBM copy-
right notice are included on the first page. The title and abstract,
but no other portions, of this paper may be copied or distributed
royalty free without further permission by computer-based and
other information-service systems. Permission to republish any
other portion of this paper must be obtained from the Editor.
IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001 0018-8670/01/$5.00 © 2001 IBM IHDE ET AL.