Data-type Dependent Cache Prefetching for MPEG Applications

R. Cucchiara and A. Prati
Dipartimento di Ingegneria dell’Informazione
Università di Modena e Reggio Emilia
Modena, Italy

M. Piccardi
Department of Computer Systems
University of Technology
Sydney, Australia

Abstract

Data cache prefetching is an effective technique to improve the performance of cache memories, provided that the prefetching algorithm is able to correctly predict which data will be useful. To this aim, the prefetching algorithm must use adequate information on the program’s data locality. In particular, multimedia applications are characterized by a substantial amount of image and video processing, which exhibits spatial locality in both dimensions of the 2D data structures used for images and frames. However, multimedia programs also make many memory references to non-image data, characterized by standard spatial locality. In this work, we explore the adoption of different prefetching techniques depending on the data type (i.e., image and non-image), thus making it possible to tune the prefetching algorithms to the different forms of locality and to achieve overall performance optimization. In order to prevent interference between the two data types, a split cache with two separate caches for image and non-image data is also evaluated as an alternative to a standard unified cache. Results on a multimedia workload (MPEG-2 and MPEG-4 decoders) show that standard prefetching techniques such as One-block-lookahead and the Stride Prediction Table are effective for standard data, while novel 2D prefetching techniques perform best on image data. In addition, at equal size, unified caches generally offer better performance than split caches, thanks to the more flexible allocation of the unified cache space.
1 Introduction

Multimedia computing is a field of increasing importance in IT, to the point of strongly influencing the design of modern computers. Indeed, multimedia processing is spreading not only in strictly multimedia applications such as videoconferencing and virtual environments, but also in many general-purpose applications augmented with multimedia capabilities such as, for instance, Internet browsers.

MPEG standards have played a prime role in the development of multimedia applications, since they have allowed interoperability between different platforms and manufacturers of multimedia components. Amongst MPEG standards, the two most important are MPEG-2 and MPEG-4: MPEG-2 defines compressed video formats over a large variety of frame sizes and rates (covering also the MPEG-1 standard), used massively for movie distribution on the Internet and for DVDs; the increasingly widespread MPEG-4 standard instead targets a more ambitious range of applications, including higher compression rates, object-based encoding, interactive multimedia and interactive graphics.

MPEG processing consists mainly of image sequence manipulation; therefore, large image data structures need to be allocated and frequently accessed in MPEG programs. Since images are processed in square pixel blocks (8x8 or 16x16 in size), significant 2D spatial locality dominates memory accesses to image data. Nevertheless, profiling the memory reference traces shows that many other memory accesses are made to non-image data, interleaved with image accesses. Non-image data are mainly generic data structures which exhibit standard (1D) spatial locality. As a consequence, data types exhibiting different forms of spatial locality are mixed in MPEG computation.

Amongst the many techniques proposed in the literature for optimizing memory performance, cache prefetching has proven to be one of the most effective [15][19][18].
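As an illustrative sketch (not taken from the paper), the following Python fragment contrasts the 2D spatial locality of block-based image access with the 1D locality of a sequential scan, both described above. It assumes a row-major frame with one byte per pixel, a frame width of 720 pixels and 32-byte cache blocks; these values, and all function names, are hypothetical and chosen only for illustration.

```python
def block_addresses(base, width, x0, y0, size):
    """Byte addresses read when scanning a size x size pixel block
    of a row-major frame (assumption: one byte per pixel)."""
    return [base + (y0 + dy) * width + (x0 + dx)
            for dy in range(size) for dx in range(size)]


def cache_lines(addresses, line_size):
    """Distinct cache-block indices covering the addresses,
    listed in order of first touch."""
    seen = []
    for addr in addresses:
        line = addr // line_size
        if line not in seen:
            seen.append(line)
    return seen


# Hypothetical parameters, for illustration only.
WIDTH = 720       # frame width in pixels
LINE_SIZE = 32    # cache block size in bytes

# 2D access: one 8x8 pixel block at the top-left corner of the frame.
image_lines = cache_lines(block_addresses(0, WIDTH, 0, 0, 8), LINE_SIZE)
image_strides = [b - a for a, b in zip(image_lines, image_lines[1:])]

# 1D access: a sequential scan of 64 contiguous bytes.
linear_lines = cache_lines(range(64), LINE_SIZE)
linear_strides = [b - a for a, b in zip(linear_lines, linear_lines[1:])]

print("cache blocks touched by the 8x8 block:", image_lines)
print("distance between successive blocks:   ", image_strides)
print("cache blocks touched by the 1D scan:  ", linear_lines)
print("distance between successive blocks:   ", linear_strides)
```

Under these assumptions, each row of the 8x8 pixel block falls in a different cache block, and successive cache blocks lie roughly one frame row apart, whereas the sequential scan touches adjacent cache blocks. A prefetcher that fetches only the next sequential cache block would therefore not anticipate the next row of the pixel block.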
Many recent works on cache prefetching specifically address multimedia workloads, due to the increasing spread of multimedia computing [19][20][4], demonstrating the efficacy of cache prefetching in this area as well. However, none of these proposals has explicitly highlighted the different forms of spatial locality that dominate image and non-image memory references. In this paper, we aim to demonstrate that adopting different cache prefetching techniques for different data types can result in a (further) performance