XML Document Versioning and Revalidation ⋆

Jakub Malý, Jakub Klímek, Irena Mlýnková, and Martin Nečaský

XML Research Group, Department of Software Engineering
Faculty of Mathematics and Physics, Charles University in Prague
Malostranské náměstí 25, 118 00 Praha 1
The Czech Republic
{klimek,maly,mlynkova,necasky}@ksi.mff.cuni.cz

Abstract. One of the prominent characteristics of XML applications is their dynamic nature. When a system grows and evolves, old user requirements change and/or new requirements accumulate. Apart from changes in the interface, it is also necessary to modify the existing documents with each new version, so that they are valid against the new specification. The approach presented in this work extends an existing conceptual model with support for multiple versions of the model. Thanks to this extension, it is possible to define a set of changes between two versions of a schema. This work contains an outline of an algorithm that compares two versions of a schema and produces a revalidation script in XSL.

Keywords: XML, schema, schema evolution

1 Introduction

Recently, XML [12] has become a cornerstone of many information systems. It is a de facto standard for data exchange (e.g., Web services [4]) and it is also a popular data model in databases [3]. XML schemas are utilized in two scenarios: 1) to describe the structure and check the validity of internal documents, and 2) to define the interface of a component of the system for other components, or of the system to the outer world.

Requirements change during the life cycle of the system, and so do the XML schemas. Without any supporting tools, the old and new schemas need to be examined by a domain expert. Each change must be identified and analyzed, all the relevant parts of the system modified, and the existing documents updated. This can be a time-consuming and error-prone process, but, in fact, a significant portion of the operations could be performed automatically.
⋆ This work was supported in part by the Czech Science Foundation (GA ČR), grant number P202/10/0573, and by the grant SVV-2011-263312.

V. Snášel, J. Pokorný, K. Richta (Eds.): Dateso 2011, pp. 49–60, ISBN 978-80-248-2391-1.

The existing approaches work directly at the level of XML schemas, which leads to problems with recognizing the semantics of each change. Our approach utilizes the conceptual model for understanding the semantics. Also, most of the existing approaches require the user to either revalidate the documents after