Access Time Modeling of a MLR1 Tape Drive

Olav Sandstå, Thomas Maukon Andersen, Roger Midtstraum, and Rune Sætre
Department of Computer and Information Science
Norwegian University of Science and Technology
N-7034 Trondheim, Norway
{olavsa, maukon, roger, runes}@idi.ntnu.no

Abstract

Traditionally, digital tape has been used by applications that access the tape mostly sequentially. Applications having a random access pattern to the data are much better off storing the data on magnetic disks. Today we see new application areas where the need for huge amounts of digital storage means it is not cost-effective to store all the data on magnetic disk. One example of such an application is digital video archives. In this paper, we present an access model for a Tandberg MLR1 serpentine tape drive. For such a drive, there is no direct relation between the logical block address and the corresponding physical position on the tape. This makes it difficult to optimize the usage of the tape drive when we have concurrent accesses to the tape. The paper presents a generic model of a serpentine tape drive together with several algorithms for characterizing individual tapes to improve the accuracy of the model. Based on measurements on the MLR1 tape drive, the generic model is improved. This extended model makes it possible to estimate seek times with an average error of about two seconds.

1 Introduction

A few decades ago, when the blue dinosaurs still ruled the computer rooms, digital tape was used actively in data processing. Input to the processing (program and data) and the results from the processing were stored on tape. The use of the tape was highly optimized. Most of the processing was done as batch jobs where the tape was read and written sequentially. Since then, magnetic disks have taken over as the primary storage medium for program and data. The main reason is the much better support for random access that a disk can provide.
Compared to magnetic disk, digital tape has a random access latency that is 3 to 4 orders of magnitude higher: tens of seconds versus ten milliseconds. As a result, digital tape is today mostly used as a storage medium for backup and distribution, and in applications that need to store very large amounts of measurement data. One example is NASA’s EOSDIS project, which will generate more than a terabyte of earth science data per day [7].

The main advantages of digital tape compared to magnetic disks are cost and storage density. For magnetic disks, the cost of the medium is about 1 NOK per MB, while for digital tape, the cost of the medium is about 0.05 NOK per MB. Today, a high-end disk drive stores about 10–20 GB. A digital tape stores the same amount of data at a fraction of the price and storage volume. For new applications like data mining, digital image databases, and video archives, which will require huge amounts of data storage, these two factors will be important. Many of these applications will not be able to justify the cost of magnetic disks, making digital tape an alternative. These applications will have a more random access pattern to the tapes than traditional tape applications have. To minimize the access delay to the tape, and to maximize the utilization of the tape drive, it will be necessary to optimize accesses to the tapes.

In this paper we present a model for a digital MLR1 serpentine tape. The purpose of making this model is to be able to optimize concurrent accesses to the tape. The model provides access time estimates for use by tape schedulers and other applications accessing the tape.

Just as a disk scheduler needs a detailed model of the disk to be able to rearrange the accesses to the disk, a tape scheduler needs an accurate model of the relations between the logical blocks and the physical