Seeing Through the Window: Pre-fetching Strategies for Out-of-core Image Processing Algorithms

Pinho R., Batenburg K. J. and Sijbers J.
VisionLab, Physics Dept., University of Antwerp, Antwerp, Belgium

ABSTRACT

Scientific data files have been increasing in size during the past decades. In the medical field, for instance, magnetic resonance imaging and computer aided tomography can yield image volumes of several gigabytes. While secondary storage (hard disks) increases in capacity and its cost per megabyte drops year after year, primary memory (RAM) can still be a bottleneck in the processing of huge amounts of data. This is a problem for image processing algorithms, which often need to keep in memory the original image and a copy of it to store the results. Operating systems optimize memory usage with memory paging and enhanced I/O operations. Although image processing algorithms usually work on neighbouring areas of a pixel, they follow pre-determined paths through the image and might not benefit from the memory paging strategies offered by the operating system, which are general purpose and unidimensional. With the principle of locality and these pre-determined traversal paths in mind, we developed an algorithm that uses multi-threaded pre-fetching of data to build a disk cache in memory. Using the concept of a window that slides over the data, we predict the next block of memory to be read according to the path followed by the algorithm and asynchronously pre-fetch that block before it is actually requested. While other out-of-core techniques reorganize the original file in order to optimize reading, we work directly on the original file. We demonstrate our approach in different applications, each with its own traversal strategy and sliding window structure.

Keywords: Out-of-core image processing, Pre-fetching

1. INTRODUCTION

A typical scenario in image processing applications is a pipeline of processing units.
The first unit receives the original image buffer as input, generates its result, possibly in a new buffer, and passes it on to the next unit. The second unit receives the new buffer, processes it, and hands it over to the next. This process is repeated until the last unit produces the final image, as depicted in Figure 1.

Figure 1. Image processing pipeline.

Considering such a pipeline, image processing algorithms often require that the original image and one or several copies of it be kept in memory. A problem may thus arise if the processed image buffer is too big to fit in memory. In practice, with the constant increase in the size of scientific data files, processing of large amounts of data has become commonplace. In the medical field, for instance, magnetic resonance imaging and computer aided tomography can yield image volumes of several gigabytes. Even in an optimised processing pipeline that eliminates intermediate image copies, it is usual to keep the original image in memory together with a copy of it in which to store the final result. Off-the-shelf applications [1] and implementation libraries [2, 3] commonly used in medical image processing and visualisation also try to allocate the entire image volume in main memory. When they do not manage to do so, alternative image representations are offered, e.g. at a lower scale, or they simply refuse to allocate the requested block and return an error to the user.

E-mail: {romulo.pinho, joost.batenburg, jan.sijbers}@ua.ac.be
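The sliding-window pre-fetching outlined in the abstract can be illustrated, for the simplest case of a linear traversal path, as a reader that asynchronously fetches the predicted next block from disk while the caller processes the current one. The sketch below is a minimal illustration under that assumption; the class and member names are hypothetical and do not reflect the authors' actual implementation:

```cpp
#include <fstream>
#include <future>
#include <string>
#include <vector>

// Minimal sketch: while the caller processes block i, a background
// task already reads block i+1, assuming a linear traversal path.
class BlockPrefetcher {
public:
    BlockPrefetcher(const std::string& path, std::size_t blockSize)
        : file_(path, std::ios::binary), blockSize_(blockSize), nextIndex_(0) {
        schedule();  // start fetching block 0 immediately
    }

    // Returns the next block along the traversal path; blocks only
    // if the asynchronous read has not finished yet.
    std::vector<char> next() {
        std::vector<char> block = pending_.get();
        schedule();  // predict and pre-fetch the following block
        return block;
    }

private:
    void schedule() {
        std::size_t index = nextIndex_++;
        pending_ = std::async(std::launch::async, [this, index] {
            std::vector<char> buf(blockSize_);
            file_.seekg(static_cast<std::streamoff>(index * blockSize_));
            file_.read(buf.data(), static_cast<std::streamsize>(blockSize_));
            buf.resize(static_cast<std::size_t>(file_.gcount()));
            file_.clear();  // allow further seeks after a short read at EOF
            return buf;
        });
    }

    std::ifstream file_;
    std::size_t blockSize_;
    std::size_t nextIndex_;
    std::future<std::vector<char>> pending_;
};
```

Because `next()` always waits on the outstanding read before scheduling a new one, the file stream is never accessed by two threads at once; a non-linear traversal strategy would only change the index prediction inside `schedule()`.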