AN OPTIMIZED LABEL-BROADCAST PARALLEL ALGORITHM FOR CONNECTED
COMPONENTS LABELING
João Marcelo Xavier Natário Teixeira, Bernardo Reis, Veronica Teichrieb, Judith Kelner
Virtual Reality and Multimedia Research Group, Informatics Center - Federal University of Pernambuco
Av. Prof. Moraes Rego S/N, Prédio da Positiva, 1º Andar,
Cidade Universitária, 50732-970, Recife-PE, Brasil
{jmxnt, bfrs, vt, jk}@cin.ufpe.br
ABSTRACT
This paper presents a simple and fast algorithm for labeling
connected components in binary images, based on a parallel
label-broadcast paradigm. A grid of processing units (called
spiders) is used and each element is responsible for updat-
ing its label value, during a specific number of iterations.
We describe the design and implementation of an embedded
architecture for real-time labeling of black-and-white images
based on FPGA technology. Since the image is divided
and processed independently by the processing elements, the
proposed algorithm can be used on an FPGA platform
attached to an image sensor, yielding a circuit similar to a
focal-plane processor.
1. INTRODUCTION
A labeling algorithm is a procedure for assigning a unique
label to each object (a group of connected pixels) in an
image [1]. Labeling is a prerequisite for subsequent analysis
procedures and for distinguishing and referencing the
labeled objects. It is an indispensable part of nearly all
applications in Pattern Recognition and Computer Vision.
The efficiency of the connected component labeling algo-
rithm is critical for many image processing and machine vi-
sion applications that require real time response. Advances
in the areas of parallel processing and VLSI (Very Large
Scale Integration) technology can be exploited in designing
hardware algorithms for high speed data throughput.
In this work, a fast algorithm for labeling connected
components in binary images, based on a parallel label-broadcast
paradigm, is proposed. Since the labeling problem possesses
both local and global features, many obstacles must be
overcome before a fully parallel approach becomes
possible. In this work, a grid with as many processing units
as there are pixels in the image is used, and each
element is responsible for updating its label value during a
specific number of iterations. Such a label-broadcast model
was previously adopted by [2] and [3], but due to their
connectivity scheme and processing, it takes too long to process
a single image. This problem is reduced in our
implementation, since the processing elements (PEs) are directly
connected.
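The label-broadcast idea described above can be illustrated with a small software sketch. It is not the authors' PE design: it assumes 4-connectivity, a minimum-label propagation rule, and iteration until no label changes (the paper instead runs a specific number of iterations). The function name `label_broadcast` is hypothetical. Every cell reads only its direct neighbors and all cells update from the same previous grid state, which mimics a grid of identical processing elements working in lockstep.

```python
def label_broadcast(image):
    """Label 4-connected foreground components by iterative label broadcast.

    Assumptions (not taken from the paper): 4-connectivity, min-label rule,
    and iteration until the grid is stable.
    """
    rows, cols = len(image), len(image[0])
    # Each foreground pixel starts with a unique non-zero label; background is 0.
    labels = [[r * cols + c + 1 if image[r][c] else 0 for c in range(cols)]
              for r in range(rows)]
    iterations = 0
    changed = True
    while changed:
        changed = False
        iterations += 1
        # Synchronous update: every cell reads the previous grid state.
        new = [row[:] for row in labels]
        for r in range(rows):
            for c in range(cols):
                if labels[r][c] == 0:
                    continue  # background cells never take a label
                best = labels[r][c]
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc]:
                        best = min(best, labels[nr][nc])
                if best != labels[r][c]:
                    new[r][c] = best
                    changed = True
        labels = new
    return labels, iterations


image = [[1, 1, 0],
         [0, 0, 0],
         [0, 1, 1]]
labels, iterations = label_broadcast(image)
print(labels)  # [[1, 1, 0], [0, 0, 0], [0, 8, 8]]
```

In this sketch the number of iterations needed grows with the longest path inside a component, which is why the connectivity between processing elements matters for the overall labeling time.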
All processing elements perform exactly the same
computation at the same time. The architecture and PEs have
low complexity and can be implemented, for example, as a
special-purpose VLSI chip. The algorithm has a time
complexity of 1, with a multiplicative factor of 0.5.
Theoretically, using a clock cycle of 15 ns, the proposed hardware
implementation is able to process a 128 × 128 image in
0.122865 milliseconds, using a grid of 128 × 128 processors.
In order to validate the proposed algorithm, we developed
two different implementations on an FPGA platform.
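The timing figure quoted above can be checked with simple arithmetic. The sketch below converts the stated processing time into clock cycles and compares it with half the pixel count. The relation to half the pixel count minus one is an observation about the quoted numbers, consistent with the 0.5 multiplicative factor; it is not a formula given in the text. The constant names are illustrative.

```python
CLOCK_NS = 15             # clock cycle stated in the text, in nanoseconds
QUOTED_TIME_NS = 122_865  # 0.122865 ms expressed in nanoseconds
WIDTH = HEIGHT = 128      # image size stated in the text

cycles = QUOTED_TIME_NS // CLOCK_NS  # clock cycles implied by the quoted time
pixels = WIDTH * HEIGHT              # one processing element per pixel

print(cycles)           # 8191
print(pixels // 2 - 1)  # 8191 (observation, not a formula from the paper)
```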
This paper is organized as follows. Section 2 reviews
research related to this work. The proposed algorithm
is introduced in Section 3. Its FPGA implementation
is described in Section 4. The strengths and weaknesses of
the algorithm are discussed in Section 5. Finally, Section 6
draws some conclusions and outlines new directions for this work.
2. RELATED WORK
In general, sequential techniques are inefficient in terms of
space and time requirements, whereas parallel algorithms
are based on expensive general purpose parallel computation
models. Sequential labeling approaches have reached their
best performance results with Chang’s and Wu’s algorithms
[1] [4].
Different algorithmic techniques have been proposed to
exploit the local and global features of the labeling
problem. Alnuweiri and Prasanna [5] characterize and survey
various parallel architectures and computation models
that implement these techniques.
Crookes and Benkrid [6] describe an architecture based
on a serial, recursive algorithm for labeling. The algorithm
iteratively scans the input image, performing a non-zero
maximum neighborhood operation divided into two passes: a
forward pass and an inverse one. The main disadvantage of this
technique is that the time to label a whole image depends on
99 978-1-4244-6311-4/10/$26.00 ©2010 IEEE