The Prickly Pear Archive: A Portable Hypermedia for Scholarly Publication

Dennis G. Castleberry
Department of Computer Science
Center of Computation and Technology
Louisiana State University
Baton Rouge, LA 70803, USA
dcastl2@cct.lsu.edu

Steven R. Brandt
Department of Computer Science
Center of Computation and Technology
Louisiana State University
Baton Rouge, LA 70803, USA
sbrandt@cct.lsu.edu

Frank Löffler
Center of Computation and Technology
Louisiana State University
Baton Rouge, LA 70803, USA
knarf@cct.lsu.edu

Hari Krishnan
Department of Computer Science
Center of Computation and Technology
Louisiana State University
Baton Rouge, LA 70803, USA
hkrish4@cct.lsu.edu

ABSTRACT
An executable paper is a hypermedia for publishing, reviewing, and reading scholarly papers that include a complete HPC software development or scientific code. A hypermedia is an integrated interface to multimedia, including text, figures, video, and executables, on a subject of interest. Results within the executable paper include numeric output, graphs, charts, tables, equations, and the underlying codes that generated them. These results are dynamically regenerated and included in the paper upon recompilation and re-execution of the code. This enables a scientifically enriched environment that functions not only as a journal but as a laboratory in itself, in which readers and reviewers may interact with and validate the results.

The Prickly Pear Archive (PPA) is such a system [2]. One distinguishing feature of the PPA is the inclusion of an underlying component-based simulation framework, Cactus [8], which simplifies the process of composing, compiling, and executing simulation codes. Code creation is simplified using common bits of infrastructure; each paper adds to the functionality of the framework.
New distinguishing features include the (1) portability and (2) reproducibility of the archive, which allow researchers to move and re-create the software environment in which the simulation code was created. Further, the (3) Piraha parser is now used to match complex multi-line expressions inside parameter and LaTeX files. Finally, (4) an altogether new web interface has been created. The new interface options closely mirror the directory structure within the paper itself, which gives the reader a transparent view of the paper. Thus, for a user already accustomed to reading from the archive, assembling a paper package becomes a straightforward and intuitive process.

A PPA production system hosted on HPC resources (e.g. an XSEDE machine) unifies the computational scientific process with the publication process. A researcher may use the production archive to test simulations and, upon arriving at a scientifically meaningful result, incorporate that result in an executable paper on the very same resource on which the simulation was conducted. Housed within a virtual machine, the PPA allows multiple accounts within the same production archive, enabling users across campuses to combine their efforts in developing scientific codes.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
XSEDE12, July 16 - 20 2012, Chicago, Illinois, USA
Copyright 2012 ACM 978-1-4503-1602-6/12/07 ...$15.00.
Categories and Subject Descriptors
D.2.6 [Software]: Software Engineering—Interactive Environments; H.5.4 [Information Interfaces and Presentation]: Hypertext/Hypermedia; H.2.8 [Database Management]: Database Applications—Scientific Databases

1. INTRODUCTION
The markup code of an executable paper incorporates references to manipulable parameters, perhaps including symbolic equations, which the simulation executable accepts as input. An interface to this markup code enables authors, readers, and reviewers to control these parameters and regenerate the paper, potentially arriving at a novel result. Thus, the executable paper functions not only as a publication in itself, but also as an interactive lab from which novel science may be extracted. When encapsulated within a virtual machine, it functions as a portable laboratory which other scientists may copy and re-use. The notion of an executable paper is particularly useful in the context of computer and computational science, where the code underlying a (scientific) software development is of interest to the wider development community.

Why are executable papers to be preferred over traditional papers? As Gavish et al. have observed [7], the current workflow in the life cycle of a traditional paper may be summarized in the following five steps: