Uncertainty in Semantic Schema Integration

Nikos Rizopoulos 1, Matteo Magnani 2, Peter McBrien 1, and Danilo Montesi 3

1 Department of Computing, Imperial College London, {nr600,pjm}@doc.ic.ac.uk
2 Department of Computer Science, University of Bologna, matteo.magnani@cs.unibo.it
3 Department of Mathematics and Informatics, University of Camerino, danilo.montesi@unicam.it

1 Introduction

In this paper we present a new method of semantic schema integration, based on uncertain semantic mappings. The purpose of semantic schema integration is to produce a unified representation of multiple data sources. First, schema matching [1] is performed to identify the semantic mappings between the schema objects. Then, an integrated schema is produced during the schema merging process [2] based on the identified mappings. If all semantic mappings are known, schema merging can be performed (semi-)automatically.

As an illustrative example, consider the schemas S1 and S2 in Figure 1. Schema S1 models a data source of undergraduate students. Undergraduates are registered (reg) in courses that are taught (tch) by staff members. Schema S2 models a data source of postgraduate students, who can also optionally register in fourth-year courses to refresh their knowledge or familiarize themselves with new subjects. Therefore, S1.student and S2.student are disjoint, while S1.course subsumes S2.course.

Such semantic mappings drive the schema integration process. For example, the disjointness mapping between the student entities triggers schema transformations that rename the entities to make them distinct, e.g. into ug and pg, and add a union entity, e.g. student, that represents the union of undergraduate and postgraduate students.

In this example, we already know the semantics of the schema objects, and thus we can specify their semantic mappings. However, this is not true in general.
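The transformations triggered by the disjointness mapping between the two student entities can be illustrated with a minimal Python sketch. The function name `integrate_disjoint` and the dictionary-based representation are hypothetical, chosen only for illustration; they are not the representation or the algorithm used by the authors.

```python
def integrate_disjoint(obj1, obj2, name1, name2, union_name):
    """Sketch of the transformations triggered by a disjointness mapping.

    obj1, obj2  : qualified names of the two disjoint schema objects
    name1, name2: new, distinct names given to the two objects
    union_name  : name of the added entity representing their union
    (All parameter names are illustrative assumptions.)
    """
    # Step 1: rename the two entities so that they become distinct.
    renames = {obj1: name1, obj2: name2}
    # Step 2: add a union entity covering both renamed entities.
    union = {union_name: [name1, name2]}
    return renames, union


# The student example from the text: S1.student and S2.student are disjoint.
renames, union = integrate_disjoint(
    "S1.student", "S2.student", "ug", "pg", "student"
)
print(renames)
print(union)
```

Here `renames` records that S1.student becomes ug and S2.student becomes pg, while `union` records that the new student entity is the union of ug and pg. Other mapping types, such as the subsumption between the course entities, would need their own transformations and are not covered by this sketch.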
Manual schema matching is usually time-consuming, and automatic schema matching is uncertain because the semantics of schema objects cannot be directly compared. To take this uncertainty in the schema matching results into account, we extend the concept of semantic mapping.

We assume a finite amount of belief that can be distributed among alternative semantic mappings. When we are certain about a mapping, we assign all our belief to it. This is implicitly done by most existing schema matching techniques [1]. A straightforward extension of this concept is to allow several alternative mappings to be possible and to distribute our belief among them. For example, we may think that the two student entities in S1 and S2 are either disjoint, in case they refer to undergraduates and postgraduates,