“Almost automatic” and Semantic Integration of XML Schemas at Various “Severity” Levels

Pasquale De Meo¹, Giovanni Quattrone¹, Giorgio Terracina², and Domenico Ursino¹

¹ DIMET, Università Mediterranea di Reggio Calabria, Via Graziella, Località Feo di Vito, 89060 Reggio Calabria, Italy
{demeo, quattrone}@ing.unirc.it, ursino@unirc.it
² Dipartimento di Matematica, Università della Calabria, Via Pietro Bucci, 87036 Rende (CS), Italy
terracina@mat.unical.it

Abstract. This paper presents a novel approach for the integration of a set of XML Schemas. The proposed approach is specialized for XML, is almost automatic, semantic and “light”. As a further, original, peculiarity, it is parametric w.r.t. a “severity” level against which the integration task is performed. The paper describes the approach in all details, illustrates various theoretical results, presents the experiments we have performed for testing it and, finally, compares it with various related approaches already proposed in the literature.

1 Introduction

The Web is presently playing a key role for both the publication and the exchange of information among organizations. As a matter of fact, it is becoming the reference infrastructure for most of the applications conceived to handle interoperability among partners.

In order to make Web activities easier, W3C (World Wide Web Consortium) proposed XML (eXtensible Markup Language) as a new standard information exchange language that unifies representation capabilities, typical of HTML, and data management features, typical of classical DBMS. The twofold nature of XML allowed it to gain a great success and, presently, most of the new documents published on the Web are written in XML.

However, from the data management point of view, XML documents alone have limited and primitive capabilities.
In order to improve these capabilities, in such a way as to make them similar to those typical of classical DBMS, W3C proposed to associate XML Schemas with XML documents. An XML Schema can be considered as a sort of catalogue of the information typologies that can be found in the corresponding XML documents; from another point of view, an XML Schema defines a reference context for the corresponding XML documents.

R. Meersman et al. (Eds.): CoopIS/DOA/ODBASE 2003, LNCS 2888, pp. 4–21, 2003. © Springer-Verlag Berlin Heidelberg 2003

Certainly, XML exploitation is a key step for improving the interoperability of Web information sources; however, that alone is not enough to completely fulfill such a task. Indeed, the heterogeneity of data exchanged over the Web regards