MAXSM: A Multi-Heuristic Approach to XML Schema Matching

Mirza Beg, Laurent Charlin, Joel So
David R. Cheriton School of Computer Science
University of Waterloo
{mbeg, lcharlin, j2so}@cs.uwaterloo.ca

December 11, 2006

Abstract

Transformation of business messages from one trading partner's definition to another, or from one business message type to another, is a common requirement for enterprise data integration applications. Transforming these business messages entails resolving issues of structural and semantic heterogeneity between their schemas. In this paper, we propose an automatic schema matching approach called MAXSM, designed specifically for matching schemas in the context of enterprise data integration. MAXSM is a multi-heuristic schema matcher which employs, amongst other heuristics, a novel heuristic using WordNet for determining natural-language semantic similarity between schemas. MAXSM also introduces "transitive mappings" that can be discovered from known mappings and used to seed production of candidate mappings. We present a cogent tree-spanning approach to search two schemas more effectively for node-level match candidates.

Keywords: XML Schema, Schema Matching, Schema Mapping, Semantic Matching, Tree Matching, Maximum Matching Algorithms, Location Path

1 Introduction

The adoption of XML to represent and communicate business information has increased significantly over the last decade and continues to see sustained growth. Thus, data integration scenarios in the context of electronic commerce, business-to-business integration, and enterprise application integration commonly involve the transfer and transformation of XML documents. Transformation of business messages from one trading partner's definition to another, or from one business message type to another (within or across business processes), is a common requirement of almost any enterprise data integration application.
In order to transform between these business messages, corresponding schema mappings must first be defined.

Copyright © 2006 Mirza Beg, Laurent Charlin, and Joel So. Permission to copy is hereby granted provided the original copyright notice is reproduced in copies made.