A PARALLEL IMAGE SEGMENTATION ALGORITHM ON GPUS

P. N. Happ a,*, R. Q. Feitosa a, C. Bentes b, R. Farias c

a Department of Electrical Engineering, Pontifical Catholic University of Rio de Janeiro, Rua Marquês de São Vicente 225, Gávea, CEP 22451-900, Rio de Janeiro, RJ, Brazil - {patrick, raul}@ele.puc-rio.br
b Dept. of Computer and Systems, Rio de Janeiro State University, Rua São Francisco Xavier 524, Maracanã, CEP 20550-900, Rio de Janeiro, RJ, Brazil - cris@eng.uerj.br
c Federal University of Rio de Janeiro, P.O. Box 6851, CEP 21945-970, Rio de Janeiro, RJ, Brazil - rfarias@cos.ufrj.br

KEY WORDS: Image Segmentation, Parallel Processing, GPU

ABSTRACT:

Image segmentation is a computationally expensive task that continuously presents performance challenges due to the increasing volume of available high-resolution remote sensing images. Nowadays, Graphics Processing Units (GPUs) are emerging as an attractive computing platform for general-purpose computations due to their extremely high floating-point processing performance and their comparatively low cost. In the image analysis context, the use of GPUs can accelerate the segmentation process. This work presents a parallel implementation of a region growing algorithm for GPUs. The parallel algorithm processes each pixel in a different thread so as to take advantage of the fine-grained parallel capability of the GPU. In addition to the parallel algorithm, the paper also suggests a modification to the heterogeneity computation that improves the segmentation performance. The experimental results demonstrate that the parallel algorithm achieves significant performance gains, running up to 6.8 times faster than the sequential approach.

* Corresponding author.

INTRODUCTION

Image segmentation has been the subject of extensive research in the areas of digital image processing and computational vision.
The segmentation process plays a key role in image analysis (Blaschke and Strobl, 2001), and many segmentation methods have been proposed in the literature (Riseman and Arbib, 1977; Fu and Mui, 1981; Haralick and Shapiro, 1985; Pal and Pal, 1993; Deb, 2008), together with metrics for quality assessment (Zhang, 1996; Correa and Pereira, 2000; Cardoso and Corte-Real, 2005; Zhang et al., 2008). Among the image segmentation methods, the region growing algorithm is one of the best known and the most widely used in the remote sensing area (Tilton and Lawrence, 2000). Region growing algorithms group pixels or sub-regions into larger regions in an iterative way. The process starts with a set of initial points, called seeds, which grow by merging adjacent regions that have similar properties, such as texture or color. However, this segmentation technique is computationally expensive when large images are considered (Wassenberg et al., 2009). In addition, region growing usually has some parameters that must be adjusted for each type of application, which implies a number of executions until the optimal parameter values are found. Thus, the execution time of the segmentation is decisive for its operational use in automatic image interpretation systems. For this reason, computational acceleration is highly desirable.

Recent advances in the hardware architecture and programmability of Graphics Processing Units (GPUs) have turned them into an attractive platform for accelerating general-purpose floating-point computations. They offer promising speedups, are available off-the-shelf, and it is likely that most computers will be equipped with such devices in the future. Modern GPUs can achieve performance at least one order of magnitude higher than that of traditional CPUs. The challenge, however, is how to program these devices efficiently: parallelizing an algorithm to fit the highly parallel architecture of the GPU can be a difficult task.
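To make the region growing process described above concrete, the following is a minimal sequential sketch, not the paper's implementation. The function name, the 4-connected neighborhood, and the intensity-difference-to-region-mean merging criterion are illustrative assumptions; real remote sensing segmenters use richer heterogeneity measures over texture or color.

```python
# Illustrative seeded region growing sketch (NOT the paper's algorithm).
# Each seed grows by absorbing 4-connected neighbours whose intensity
# stays within `threshold` of the running region mean.
from collections import deque

def region_growing(image, seeds, threshold):
    """Label pixels of a 2D grayscale image by growing regions from seeds."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 = unassigned
    for label, (sr, sc) in enumerate(seeds, start=1):
        queue = deque([(sr, sc)])
        labels[sr][sc] = label
        region_sum, region_count = image[sr][sc], 1
        while queue:
            r, c = queue.popleft()
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == 0:
                    mean = region_sum / region_count
                    if abs(image[nr][nc] - mean) <= threshold:
                        labels[nr][nc] = label
                        region_sum += image[nr][nc]
                        region_count += 1
                        queue.append((nr, nc))
    return labels

# Example: two seeds separate a dark left block from a bright right column.
img = [[10, 11, 50],
       [10, 12, 52],
       [11, 11, 51]]
print(region_growing(img, [(0, 0), (0, 2)], threshold=5))
# -> [[1, 1, 2], [1, 1, 2], [1, 1, 2]]
```

The sequential breadth-first growth shown here is exactly the part that becomes the bottleneck on large images, which motivates the per-pixel-thread parallel formulation proposed in this work.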
Several GPU implementations of image segmentation methods have been proposed in the literature. Some of them were built on the ease of evaluating partial differential equations in a stream processing model (Sherbondy et al., 2003; Lefohn et al., 2003). There are also research efforts in the area of medical imaging (Ruiz et al., 2008; Erdt et al., 2008; Ahn et al., 2005; Unger et al., 2008; Pan et al., 2008). The particular case of satellite images deserves special attention. Sun et al. (2009) implemented a parallel segmentation method on the GPU for remote sensing images based on the Mean Shift clustering algorithm. Their approach starts from selected seeds and clusters the pixels near the seeds. The center of each cluster is computed and the regions grow from these centers. This two-step method leads to a pixel-independent parallel implementation that achieved a speedup of around 20 for IKONOS and QuickBird images. Nevertheless, as far as we know, there is no GPU implementation of an unseeded region growing algorithm.

Proceedings of the 4th GEOBIA, May 7-9, 2012 - Rio de Janeiro - Brazil. p.580