Hydra Version Control System

Christoph P. Neumann, Scott A. Hady, Richard Lenz
Computer Science 6 (Data Management)
Friedrich-Alexander University, Erlangen, Germany
christoph.neumann@cs.fau.de

Abstract: The Hydra project offers version control for distributed case files in α-Flow. Available version control systems lack support for independent versioning of multiple logical units within a single repository, each with its own version history and head. Our use case also requires mechanisms for labeling versions by their validity and for validity-based navigational access. Hydra is a multi-module and validity-aware version control system.

Keywords: Versioning, Multi-Module, Case Files, Valid Paths

I. Introduction & Background

The research project α-Flow (cf. [1], [2]) provides an approach to distributed electronic case files, called α-Docs. α-Docs are distributed by storing a local copy for each of the cooperating parties and by synchronizing updates and extensions among these copies (cf. [3]). The α-Doc contains its own execution environment (cf. [4]); its subsystem implementations must be as small as possible. The motivation for α-Docs is not important in this context.

From the perspective of version control systems (VCS), the α-Doc contains a repository that is structured into data modules, so-called α-Card units. Each data module, i.e. each α-Card unit, is an independent set of hierarchically structured files.

Hydra provides embedded versioning for the α-Flow engine within an α-Doc. Hydra has been implemented as an autonomous component that can also be used via a command-line interface as a stand-alone VCS. The unique functional features of Hydra, 1) multi-module support and 2) validity-awareness, are derived from our use case, yet the reasoning is of general concern.

II. Objectives

Healthcare processes are paper-based and there exist logical units (LU) of paper artifacts.
Each such unit has an owner who is at least the contact person or who even takes legal responsibility. In an inter-institutional medical process, several LUs constitute the patient's case file. Each LU can be considered as a kind of data module. In an electronic document analogue, the version history of each LU must be available independently for data provenance purposes. Thus, each data module requires its own independent VCS history. However, the overall team progress, i.e. data production over all data modules, must also remain trackable. This is basically the same situation as in parallel software development with conflicting updates (on medical content files instead of source code files) and with grouping artifacts via LUs. The term 'LU' unifies 'data module' and 'software module'.

The first Hydra objective is to provide a generic VCS concept for (1) managing multiple LUs within a single repository. An LU is defined as an arbitrary set of hierarchically structured files. Within a single repository, both (i) an independent version history, navigation, and check-out head must be kept for each LU, and (ii) a common version state over all LUs must be provided for module-interdependency maintenance.

The second Hydra objective is to allow for (2) labeling versions by a valid/invalid flag and to enable validity-based version navigation. This means providing both (i) a system path navigation over all version states, as offered by common VCS navigation, and (ii) a valid path navigation that operates only on the valid versions.

Validity-aware version navigation is important in healthcare because in inter-institutional environments physicians are willing to provide preliminary information to their peers. Preliminary versions are invalid in the sense of "not signed off", and their content should only be consumed with appropriate caution. Validity can imply acceptability instead of formal correctness.
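The two objectives can be illustrated by a minimal sketch. It is not Hydra's implementation: all class and method names (`Version`, `LogicalUnit`, `Repository`, `system_path`, `valid_path`, `common_state`) and the LU names are hypothetical, and each history is assumed to be a simple linear list, whereas a Git-inspired object model would allow branching. The sketch only shows how one repository might keep an independent history and head per LU, a common state over all LUs, and a valid path next to the system path.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Version:
    """One committed state of a logical unit (hypothetical structure)."""
    number: int
    content: str
    valid: bool = True  # False marks a preliminary, "not signed off" version


@dataclass
class LogicalUnit:
    """An LU with its own independent version history and check-out head."""
    name: str
    history: List[Version] = field(default_factory=list)
    head: Optional[int] = None  # index into history; None while empty

    def commit(self, content: str, valid: bool = True) -> Version:
        version = Version(len(self.history) + 1, content, valid)
        self.history.append(version)
        self.head = len(self.history) - 1
        return version

    def system_path(self) -> List[Version]:
        """All versions of this LU, valid or not."""
        return list(self.history)

    def valid_path(self) -> List[Version]:
        """Only the versions labeled as valid."""
        return [v for v in self.history if v.valid]

    def latest_valid(self) -> Optional[Version]:
        path = self.valid_path()
        return path[-1] if path else None


@dataclass
class Repository:
    """A single repository that holds several LUs."""
    units: Dict[str, LogicalUnit] = field(default_factory=dict)

    def unit(self, name: str) -> LogicalUnit:
        return self.units.setdefault(name, LogicalUnit(name))

    def common_state(self) -> Dict[str, Optional[int]]:
        """Head version number of every LU: a common state over all LUs."""
        return {
            name: (lu.history[lu.head].number if lu.head is not None else None)
            for name, lu in self.units.items()
        }


# Usage with two assumed LUs of a case file.
repo = Repository()
report = repo.unit("radiology-report")
report.commit("draft findings", valid=False)   # preliminary version
report.commit("signed findings")               # signed off
report.commit("amended draft", valid=False)    # preliminary again
repo.unit("referral").commit("referral letter")
```

In this sketch the system path of the report contains all three versions, while the valid path contains only the signed one, so a reader who navigates the valid path is not exposed to preliminary content unless it is requested explicitly.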
The validity facilities of Hydra are also required by the α-Flow protocol for synchronizing distributed α-Doc nodes (cf. [3]). Hydra facilitates the protocol implementation: in case of global conflicts, the conflicting versions can simply be marked as invalid. The invalidated versions are required for reconciliation, for which they remain accessible via Hydra's system path navigation. Yet, until the conflict is reconciled, the valid path provides the team members with access to the latest globally valid, i.e. conflict-free, version.

III. Methods

The Hydra versioning is inspired by Git (e.g. [5]) and its object model¹. Hydra also adopts full copy storage

¹ e.g. http://eagain.net/articles/git-for-computer-scientists

Copyright by IEEE. DOI: 10.1109/XXXXXXXXX
The original publication is available at http://ieeexplore.ieee.org/xpls/abs_all.XXXXXXXXXXXx