Ruixin Yang, Menas Kafatos, and X. Sean Wang
George Mason University
Managing Scientific Metadata Using XML
This XML-based distributed system manages scientific
metadata in various formats and supports sophisticated
search and interactive data-access capabilities.
Satellites and other Earth-observing systems produce huge amounts of data at ever-expanding rates. The EOS satellite Terra alone adds more than half a terabyte of data each day,¹ and other Earth-observing platforms and computer weather and climate models produce even more. To use this data in their research effectively, scientists need distributed user-centric information systems with effective search, analysis, and ordering capabilities.
A data access and analysis system must let scientific data users find, evaluate, access, and use data online regardless of its location or format. A data-delivery mechanism as simple as FTP is useful for data exchange, but its limitations are obvious. The Distributed Oceanographic Data System (DODS, www.unidata.ucar.edu/packages/dods), which enables distributed access to online digital data, is a more sophisticated data-delivery system. When one integrates DODS with the Grid Analysis and Display System (Grads),² the resulting GDS³ lets users define operations performed on the server and obtain the resultant data. However, GDS does not possess enough searchable metadata to let users locate data quickly.
In this article, we present our XML-based Distributed Metadata Server (Dimes),⁴ which comprises a flexible metadata model, search software, and a Web-based interface that together support multilevel metadata access, and we introduce two prototype systems. Our Scientific Data and Information Super Server (SDISS), which is based on Dimes and GDS, addresses the problems of inaccurate data searches and outdated data links by integrating metadata with the data systems. On the implementation front, we combine independent components and open-source technologies into a coherent system to dramatically extend system capabilities. The same approach can be applied to other scientific communities, such as bioinformatics and space science.
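To make the idea of searchable XML metadata concrete, the following is a minimal sketch in Python, not the Dimes implementation. The element names (metadata, dataset, title, parameter, format, link), the sample records, and the example.org links are all hypothetical and chosen only for illustration. The sketch stores each dataset description as an XML record and runs a keyword search over all of its text, returning the data link kept alongside the metadata.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata records. The element and attribute names are
# illustrative assumptions only; they are not the Dimes schema.
SAMPLE = """
<metadata>
  <dataset id="sst-monthly">
    <title>Monthly sea surface temperature</title>
    <parameter>sea surface temperature</parameter>
    <format>HDF</format>
    <link>http://example.org/data/sst-monthly</link>
  </dataset>
  <dataset id="precip-daily">
    <title>Daily precipitation</title>
    <parameter>precipitation</parameter>
    <format>binary</format>
    <link>http://example.org/data/precip-daily</link>
  </dataset>
</metadata>
"""


def search(xml_text, keyword):
    """Return (id, link) pairs for datasets whose text mentions keyword."""
    root = ET.fromstring(xml_text)
    needle = keyword.lower()
    hits = []
    for ds in root.findall("dataset"):
        # Match against every text node under the record, so the search
        # does not depend on a fixed set of fields.
        text = " ".join(t.strip() for t in ds.itertext()).lower()
        if needle in text:
            hits.append((ds.get("id"), ds.findtext("link")))
    return hits


if __name__ == "__main__":
    print(search(SAMPLE, "temperature"))
```

Because the search walks whatever elements a record contains, records with different fields can sit in the same collection, which is one simple way to accommodate metadata in various formats. Keeping the link inside the record is what lets a search result point directly at the data.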
IEEE Internet Computing, July/August 2002, page 52. http://computer.org/internet/ 1089-7801/02/$17.00 © 2002 IEEE
Database Technology on the Web