Generating XSLT with a Semantic Hub

Transformations for the Semantic Web

Joshua Fox <joshua@unicorn.com>

Abstract

XSLT is the standard technique for integrating heterogeneous XML-based applications, whether in messaging environments like EAI or in request/response systems like the Semantic Web. With the techniques commonly used today, writing XSLT to integrate multiple XML formats requires significant effort. The usual procedure requires an analysis of both the semantics and the structures of the source and target XML files, often by examining instance documents. Manual coding then follows this analysis.

A partial step towards reducing the time and errors in this procedure is to use schemas or DTDs; software can use these to assist in developing the XSLT. Currently available graphical tools can do this, but they still require developers to manually indicate mappings for each source-target pair.

The heart of the problem is that schemas formalize only the documents' structure, not their semantics. As a result, the XSLT developer must make the effort in each case to re-analyze the "meaning" of each schema's XML tags. The complexity of this repeated effort in developing point-to-point XSLTs rises quadratically (O(n²)) in the number of schemas that must be integrated. Such XSLTs cannot be maintained or reused as schemas change and as new schemas are added.

A new solution involves capturing the semantics (meaning) of the schemas and using these semantics to automatically generate the necessary XSLTs. The developer first defines a rich information model using ontology, a formal technique for representing real-life semantics through concepts such as classes, relationships, and inheritance. This model is itself valuable in clarifying the application domain. The developer then maps the schemas' elements, complex types, and simple types to the information model, thereby formally capturing the schemas' semantics.
In the next step, the active semantic hub is used to generate the XSLT based on the elements' meanings. The algorithm finds elements of the source and target that are mapped to the same ontological concepts, or to concepts that can be related to each other through encoded conversion rules. Because each schema is mapped once to the hub rather than once to every other schema, this gives the well-known advantage of linear (O(n)) complexity in a hub architecture, as opposed to the quadratic complexity of point-to-point solutions.

In the final step, the XSLT is deployed for runtime use, for example in an EAI message broker or in a Semantic Web application. This deployment can be manual or automated.

Table of Contents

1. Introduction
2. XSLT: Hard to Write
    2.1. A Functional, XML-based Language
    2.2. XSLT: XML-based Syntax
3. XSLT Development Today
    3.1. The Most Common Development Technique
    3.2. Manual Schema-to-Schema Development
    3.3. Schema-less XSLT Generation Tools
    3.4. XSLT-Generation Tools for HTML

XML 2002 Proceedings
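The matching step described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes flat, one-level schemas, handles only elements mapped to the same ontological concept (no conversion rules), and every name in it (`generate_xslt`, the element names, the concept names such as `Person.fullName`) is hypothetical.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def generate_xslt(source_root, source_map, target_root, target_map):
    """Build a stylesheet from two schema-to-ontology mappings.

    source_map and target_map map child element names to (hypothetical)
    ontology concept names. Elements are paired when they share a concept.
    """
    # Invert the source mapping: ontology concept -> source element name.
    concept_to_source = {concept: elem for elem, concept in source_map.items()}
    lines = [
        f'<xsl:stylesheet version="1.0" xmlns:xsl="{XSL_NS}">',
        f'  <xsl:template match="/{source_root}">',
        f'    <{target_root}>',
    ]
    for target_elem, concept in target_map.items():
        source_elem = concept_to_source.get(concept)
        if source_elem is None:
            continue  # no source element carries this concept; leave it out
        lines.append(
            f'      <{target_elem}>'
            f'<xsl:value-of select="{source_elem}"/>'
            f'</{target_elem}>'
        )
    lines += [f'    </{target_root}>', '  </xsl:template>', '</xsl:stylesheet>']
    return "\n".join(lines)


# Hypothetical mappings of two schemas to one shared information model.
customer_map = {"Name": "Person.fullName", "Zip": "Address.postalCode"}
client_map = {
    "FullName": "Person.fullName",
    "PostCode": "Address.postalCode",
    "Fax": "Person.faxNumber",  # no counterpart in the source schema
}

xslt = generate_xslt("Customer", customer_map, "Client", client_map)
ET.fromstring(xslt)  # raises ParseError if the result is not well-formed XML
print(xslt)
```

Each schema is mapped to the model once, so adding an n-th schema requires one new mapping rather than a new stylesheet for every existing schema; the pairwise stylesheets are then derived from the mappings on demand.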