24/7 Characterization of Petascale I/O Workloads

Philip Carns, Robert Latham, Robert Ross, Kamil Iskra, Samuel Lang
Mathematics and Computer Science Division
Argonne National Laboratory
Argonne, IL 60439
{carns,robl,rross,iskra,slang}@mcs.anl.gov

Katherine Riley
Argonne Leadership Computing Facility
Argonne National Laboratory
Argonne, IL 60439
riley@mcs.anl.gov

Abstract—Developing and tuning computational science applications to run on extreme-scale systems are increasingly complicated processes. Challenges such as managing memory access and tuning message-passing behavior are made easier by tools designed specifically to aid in these processes. Tools that can help users better understand the behavior of their application with respect to I/O have not yet reached the level of utility necessary to play a central role in application development and tuning. This deficiency in the tool set means that we have a poor understanding of how specific applications interact with storage. Worse, the community has little knowledge of what sorts of access patterns are common in today's applications, leading to confusion in the storage research community as to the pressing needs of the computational science community.

This paper describes the Darshan I/O characterization tool. Darshan is designed to capture an accurate picture of application I/O behavior, including properties such as patterns of access within files, with the minimum possible overhead. This characterization can shed important light on the I/O behavior of applications at extreme scale. Darshan also can enable researchers to gain greater insight into the overall patterns of access exhibited by such applications, helping the storage community to understand how to best serve current computational science applications and better predict the needs of future applications.
In this work we demonstrate Darshan's ability to characterize the I/O behavior of four scientific applications and show that it induces negligible overhead for I/O-intensive jobs with as many as 65,536 processes.

I. INTRODUCTION

Efficient use of extreme-scale computing resources often requires extensive application tuning. To tune applications most effectively, application developers need to be able to observe the behavior of applications before and after changes are made, so that they can assess the impact of tuning efforts. In the areas of memory and communication subsystem behavior, many tools are available that provide insight into how the application is interacting with the subsystem [1], [2], [3], [4]. These utilities play an important role in the performance tuning of applications at extreme scale.

Unfortunately, similar tools are not available for I/O. Existing I/O tools typically fall into two categories. The first category relies on tracing and logging each individual I/O operation, a task that becomes increasingly expensive at larger scale. These tools often lack the postprocessing capabilities necessary to identify salient characteristics. The second category relies on profiling and sampling in order to reduce overhead, but in doing so these tools sacrifice detail about the access patterns being generated. Just as successful tools for memory and communication characterization have been tailored to high-performance computing (HPC) demands and patterns, tools for extreme-scale I/O characterization must be crafted to capture an appropriate level of detail with minimal impact on behavior.

Additionally, there exists an overall lack of understanding of how today's computational science applications interact with the storage system. This lack of understanding has created confusion in the storage research community as to how to best focus effort.
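The middle ground between these two categories can be illustrated with a short sketch. The following is a hypothetical example (not Darshan's actual implementation, which is a C library; the `CounterFile` class and its bucket boundaries are invented for illustration): rather than emitting one trace record per operation, the wrapper accumulates a fixed set of per-file counters and an access-size histogram, so memory and logging cost stay constant regardless of how many operations the application performs, while key access-pattern characteristics are preserved.

```python
import io

# Hypothetical sketch of counter-based I/O characterization (not
# Darshan's actual implementation): keep fixed-size per-file counters
# instead of logging every individual operation.
class CounterFile:
    """Wraps a file-like object and accumulates compact statistics."""

    # Illustrative access-size bucket boundaries, in bytes.
    BUCKETS = [100, 1024, 10 * 1024, 100 * 1024, 1024 * 1024]

    def __init__(self, f):
        self._f = f
        self.stats = {
            "writes": 0,
            "bytes": 0,
            "size_hist": [0] * (len(self.BUCKETS) + 1),
        }

    def write(self, data):
        n = self._f.write(data)
        s = self.stats
        s["writes"] += 1
        s["bytes"] += n
        # Record only which size bucket this access falls into.
        for i, bound in enumerate(self.BUCKETS):
            if n <= bound:
                s["size_hist"][i] += 1
                break
        else:
            s["size_hist"][-1] += 1
        return n


f = CounterFile(io.BytesIO())
for _ in range(1000):
    f.write(b"x" * 512)                # 1000 small writes
f.write(b"y" * (2 * 1024 * 1024))      # one large write

# A handful of counters is retained, not 1001 trace records.
print(f.stats["writes"], f.stats["bytes"])
```

The trade-off is the same one described above: per-operation ordering is lost, but the aggregate pattern (many small writes plus one large one) survives in the histogram at constant cost.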
If analysis tools for application I/O incurred little overhead, these tools could be enabled for all application runs. Doing so would allow us to identify trends in applications, help us understand successful I/O strategies, and inform the storage research community as to the needs of computational science. In order to fill these two roles, a parallel I/O workload characterization tool must meet the following goals:

- Reflection of application-level behavior
- Transparency to users
- Leadership-class scalability

Reflection of application-level behavior. A 24/7 characterization tool should capture application-level behavior in order to distinguish between jobs and identify how each one interacts with the storage system. This is a straightforward goal with subtle implications.

Most HPC job workloads include a mixture of MPI-IO and traditional POSIX interface usage. Both should be captured in order to accurately represent all applications. File-system-level instrumentation is insufficient because MPI-IO or I/O forwarding may have already transformed the access pattern expressed by the application before it reaches the file system. Tying characterization to a specific file system or storage device limits portability as well.

Transparency to users. A characterization tool must be transparent to end users in order to be suitable for long-term deployment for all applications. Any burden on users or administrators acts as a barrier to participation, particularly if the goal is to characterize the entire system. Even more important, a characterization tool must minimize resource usage to the point that it has negligible impact on application performance. Performance is the foremost priority of a leadership computing platform and cannot be compromised by full-time characterization. Further, the characterization tool

978-1-4244-5012-1/09/$25.00 ©2009 IEEE