TNTBase – a Versioned XML Database

Vyacheslav Zholudev and Christoph Lange

Computer Science, Jacobs University Bremen, {v.zholudev,ch.lange}@jacobs-university.de

Abstract. A huge number of documents are created and changed in everyday life, so version control systems like Git or SVN are tightly integrated into document workflows. At the same time, XML has come of age as a basis for document formats. Even though XML, as a text-based format, is suitable for version control in principle, the fact that version control systems work on files makes it difficult to integrate fragment access techniques like XPath or XQuery. In this paper, we present the state of the art of TNTBase, a versioned XML database based on Berkeley DB XML and Subversion. The system thus integrates versioning with the fragment access needed for fine-grained document content management. It is intended as a basis for collaboratively editing and sharing XML documents, and also provides an infrastructure for specialization towards specific applications and their document formats, such as validation, format-specific “XML-database views” and human-oriented presentation.

1 Introduction

A tremendous number of documents are being created and managed all over the world. We observe the development of a deep web (web content stored in databases), from which the surface web (what we see in our browsers) is generated. With the merging of XML fragment access techniques (most notably URIs [6] and XPath [3]) with database techniques, and the ongoing development of XML-based document formats, we are seeing the beginnings of a deep web of XML documents, where surface documents are assembled, aggregated and mashed up from background information in XML databases by techniques like XQuery [7], and document (fragment) collections are managed by XQuery Update [8]. At the same time, the web is constantly changing.
Therefore, we need an infrastructure for managing changes in the XML-based deep web. Unfortunately, version control systems like Git or Subversion [26], which have heavily influenced collaboration workflows in software engineering, are deeply text-based (with respect to diff/patch/merge) and do not integrate well with XML databases and XQuery. On the other hand, some XML databases address temporal aspects, but their versioning capabilities cannot compare to those of traditional version control systems. Moreover, the latter do not provide format-specific features based on format schemas or semantics.
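
To illustrate what "deeply text-based" means in practice, the following minimal Python sketch compares two revisions of a small XML document that differ only in attribute order and quoting. The document content and the names `rev1`, `rev2`, `text_diff` and `same_xml` are hypothetical and chosen for illustration only; the sketch uses the Python standard library and says nothing about how TNTBase itself compares documents.

```python
import difflib
from xml.etree.ElementTree import canonicalize

# Two hypothetical revisions of the same document: only the attribute
# order and the quoting style differ, so the XML content is identical.
rev1 = '<doc><item id="1" name="a">text</item></doc>'
rev2 = "<doc><item name='a' id='1'>text</item></doc>"

# A line-based diff, as used by file-oriented version control systems,
# reports a change because the character sequences differ.
text_diff = list(difflib.unified_diff([rev1], [rev2], lineterm=""))

# Comparing canonical forms (C14N 2.0 normalizes attribute order and
# quoting) shows that the XML content has not changed.
same_xml = canonicalize(rev1) == canonicalize(rev2)

print("text diff reports a change:", bool(text_diff))
print("XML content is identical:", same_xml)
```

A text-based tool would record this as a modified line even though an XML-aware comparison sees no difference, which is one reason why diff/patch/merge on files may fit XML documents poorly.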