SEGMENT-BASED MOTION ESTIMATION USING A BLOCK-BASED ENGINE

Patrick Meuwissen 1, Ramanathan Sethuraman 1, Fabian Ernst 1, Harm Peters 1 and Rafael Peset Llopis 2

1 Philips Research Laboratories, Prof. Holstlaan 4 (WDC-31), 5656 AA Eindhoven, the Netherlands
2 Philips Consumer Electronics, P.O. Box 80002 (SFJ-644), 5600 JB Eindhoven, the Netherlands
{patrick.meuwissen, ramanathan.sethuraman, fabian.ernst, harm.peters, rafael.peset.llopis}@philips.com
phone: +31-40-2744523; fax: +31-40-2744639

ABSTRACT

Motion estimation is a key function in scan rate conversion, advanced picture quality improvement, 2D-to-3D content conversion, and many other video processing steps. For reasons of hardware efficiency, most motion estimation implementations are block-based. As object boundaries commonly do not coincide with block boundaries, the block-based approach may produce visible artifacts at object boundaries. Motion estimation for irregular shapes, such as image segments, can accurately track motion boundaries, but a straightforward translation of block-based motion estimation algorithms to segment-based ones leads to inefficient hardware implementations. Therefore, this paper proposes a modified segment-based motion estimation algorithm that exploits the efficiency of block-based processing. We demonstrate an efficient very long instruction word (VLIW) application-specific instruction-set processor (ASIP) implementation of this algorithm.

1. INTRODUCTION

Motion estimation is a key technology underlying many video processing applications such as video coding [1, 3], scan-rate upconversion [2], motion-compensated deinterlacing [4] and 2D-to-3D video conversion [5, 6]. Block-based motion estimation algorithms, although limited to block-accurate motion vectors, are popular because they are easy to implement with real-time performance [3].
On the other hand, image segmentation is seen as a vital ingredient for content-based video processing, for instance in the domains of content-based retrieval (e.g. MPEG-7, [1]), object tracking [7] and 2D-to-3D video conversion [6]. Current coding standards such as H.264 [8] use variable block sizes and shapes, among other things to reduce block-related artifacts; this can be considered an intermediate step between block-based approaches and fully segment-based approaches.

Segment-based motion estimation (SBME) replaces the fixed blocks of traditional motion estimation algorithms with segments of arbitrary shape and size. As motion vectors are now assigned to segments instead of blocks, motion discontinuities can be placed at their true, pixel-accurate positions in the image (see Figure 2). Segment-based approaches (e.g. [9]) have shown the potential for highly accurate motion estimation in a benchmark test [10]. Furthermore, segments, as content-dependent entities, can be tracked over multiple frames of a video sequence. This functionality, which blocks cannot provide, is useful for temporal filtering and other multi-frame processing.

However, from a hardware implementation point of view, SBME has a significant disadvantage. Since segments can be of arbitrary shape and size, a straightforward implementation of an SBME algorithm will suffer either from inefficient use of data memory bandwidth or from irregular data addressing, which also results in poor bandwidth utilization of modern memories that are optimized for burst accesses (e.g. SDRAMs). Figure 1 illustrates this in detail. This disadvantage often precludes the use of SBME in real-time video applications. In such applications, efficient memory bandwidth utilization is a must because the size of the frame buffers typically requires the use of off-chip memories, which have a limited data bandwidth compared to the processing capabilities (and thus the data bandwidth requirements) of logic chips.
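The bandwidth trade-off described above can be made concrete with a small sketch. The Python function below is purely illustrative and is not taken from the paper: the name `block_fetch_stats`, the 8x8 block size and the use of a NumPy label map are all assumptions. It counts how many blocks are fetched when the whole bounding box of a segment is read, and how many of those blocks actually contain segment pixels.

```python
import numpy as np

def block_fetch_stats(labels, segment_id, block=8):
    """Return (blocks in the segment's bounding box, blocks containing segment pixels).

    `labels` is a 2-D integer label map; `block` is an assumed block size.
    """
    h, w = labels.shape
    bh, bw = h // block, w // block
    mask = labels[:bh * block, :bw * block] == segment_id
    # Collapse the pixel mask into a block-level occupancy map.
    occ = mask.reshape(bh, block, bw, block).any(axis=(1, 3))
    rows = np.flatnonzero(occ.any(axis=1))
    cols = np.flatnonzero(occ.any(axis=0))
    if rows.size == 0:
        return 0, 0
    bbox_blocks = (rows[-1] - rows[0] + 1) * (cols[-1] - cols[0] + 1)
    return int(bbox_blocks), int(occ.sum())

# Hypothetical L-shaped segment in a 32x32 frame (4x4 blocks of 8x8 pixels).
labels = np.zeros((32, 32), dtype=int)
labels[:, :8] = 1     # left block column
labels[24:, :] = 1    # bottom block row
bbox, used = block_fetch_stats(labels, 1)
print(f"bounding box: {bbox} blocks, containing segment pixels: {used}")
```

For this made-up L-shaped segment, fewer than half of the blocks in the bounding box contain segment pixels, which is the kind of wasted bandwidth the bounding-box strategy incurs for non-convex segments.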
In this paper, we propose an SBME algorithm that applies block-based processing to calculate segment-based motion vectors. This modified SBME algorithm thus achieves an efficient use of data memory bandwidth without sacrificing the regularity of segment data addresses. Further, the modified algorithm exhibits massive parallelism, which also facilitates real-time implementations. The parallelism and the block-based memory addressing are exploited by our VLIW ASIP implementation, which has several Application Specific Units (ASUs) for accelerating inner kernels of the algorithm in a SIMD-style fashion and for buffering blocks of data that are used multiple times, thus reducing the bandwidth requirements of the data memory.

For SBME, the only requirement we place on the segmentation is that no motion discontinuity occurs inside a segment [5, 6]. As the motion is itself determined on a segment basis, this is a chicken-and-egg problem. However, a color or luminance segmentation (see Figure 2) in general fulfills this requirement. A large number of algorithms exist for color segmentation (e.g. [11, 12]). In this paper, we focus on the SBME algorithm itself, assuming that an appropriate segmentation is provided by some segmentation algorithm, for example one of those cited above.

Figure 1: Illustration of the inefficiency of a straightforward SBME algorithm. If all blocks within the bounding box are fetched for each segment (green and red blocks for segment 1), the segment data addresses for fetching the required blocks (and pixels) of the segment are simple; however, fetching non-segment blocks (red blocks for segment 1) results in inefficient use of the data memory bandwidth. Alternatively, if only those blocks that are part of a segment are fetched (green blocks for segment 1), the irregularity of the segment data addresses leads to computational overhead as well as inefficient use of data memory bandwidth.

Figure 2: Left: Block-grid overlay of a frame in the Renata video sequence. Right: Segmentation of the same frame. If segments instead of blocks are used for motion estimation, motion boundaries can be obtained with pixel accuracy.
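The general idea of computing segment-based motion vectors with block-based processing can be sketched as follows. This is a minimal illustration under stated assumptions and not necessarily the algorithm proposed in this paper: it assumes a sum-of-absolute-differences (SAD) match criterion, a single candidate vector list shared by all segments, and border clamping; the function name `segment_sad_blockwise` and all parameters are hypothetical. The frame is traversed block by block in raster order, so memory accesses stay regular, while the match error of each pixel is accumulated for the segment to which that pixel belongs.

```python
import numpy as np

def segment_sad_blockwise(cur, ref, labels, candidates, block=8):
    """Pick, per segment, the candidate vector with the lowest accumulated SAD.

    The frame is visited in regular block order; per-pixel errors are
    credited to the segment that owns each pixel (assumed scheme).
    """
    h, w = cur.shape
    cur = cur.astype(np.int64)
    ref = ref.astype(np.int64)
    seg_ids = np.unique(labels)
    index = {int(s): i for i, s in enumerate(seg_ids)}
    sad = np.zeros((len(seg_ids), len(candidates)), dtype=np.int64)

    for by in range(0, h, block):
        for bx in range(0, w, block):
            cur_blk = cur[by:by + block, bx:bx + block]
            lab_blk = labels[by:by + block, bx:bx + block]
            segs_here = np.unique(lab_blk)
            for c, (dy, dx) in enumerate(candidates):
                # Displaced reference block, clamped at the frame border.
                ys = np.clip(np.arange(by, by + cur_blk.shape[0]) + dy, 0, h - 1)
                xs = np.clip(np.arange(bx, bx + cur_blk.shape[1]) + dx, 0, w - 1)
                diff = np.abs(cur_blk - ref[np.ix_(ys, xs)])
                for s in segs_here:
                    sad[index[int(s)], c] += diff[lab_blk == s].sum()

    return {int(s): candidates[int(np.argmin(sad[i]))]
            for i, s in enumerate(seg_ids)}
```

In this sketch every block is fetched once per candidate regardless of segment shape, and a block that straddles a segment boundary contributes to several segments at once, so the resulting vectors remain pixel-accurate at segment boundaries.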