138 IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 8, NO. 1, FEBRUARY 2012
A Fusion Approach for Efficient
Human Skin Detection
Wei Ren Tan, Chee Seng Chan, Member, IEEE, Pratheepan Yogarajah, and Joan Condell
Abstract—A reliable human skin detection method that is adapt-
able to different human skin colors and illumination conditions is
essential for better human skin segmentation. Even though dif-
ferent human skin-color detection solutions have been successfully
applied, they are prone to false skin detection and are not able to
cope with the variety of human skin colors across different
ethnicities. Moreover, existing methods often incur high computational costs. In
this paper, we propose a novel human skin detection approach
that combines a smoothed 2-D histogram and Gaussian model,
for automatic human skin detection in color image(s). In our
approach, an eye detector is used to refine the skin model for a
specific person. The proposed approach reduces computational
costs as no training is required, and it improves the accuracy of
skin detection despite wide variation in ethnicity and illumination.
To the best of our knowledge, this is the first method to employ
fusion strategy for this purpose. Qualitative and quantitative
results on three standard public datasets and a comparison with
state-of-the-art methods have shown the effectiveness and robust-
ness of the proposed approach.
Index Terms—Color space, dynamic threshold, fusion strategy,
skin detection.
I. INTRODUCTION
With the growth of today's information society, images
have become increasingly important. In particular,
skin detection plays an important role in a wide range of image
processing applications from face tracking, gesture analysis,
content-based image retrieval systems to various human–com-
puter interaction domains [1]–[6]. In these applications, the
search space for objects of interests, such as hands, can be re-
duced through the detection of skin regions. One of the simplest
and commonly used human skin detection methods is to define
a fixed decision boundary for different color space components
[7]–[9]. Single or multiple ranges of threshold values for each
color space components are defined and the image pixel values
Manuscript received March 31, 2011; revised July 20, 2011, August 26,
2011; accepted September 13, 2011. Date of publication October 18, 2011;
date of current version January 20, 2012. This work was supported by the
University of Malaya HIR under Grant UM.C/625/1/HIR/037, J0000073579.
Paper no. TII-11-181.
W. R. Tan and C. S. Chan are with the Centre of Image and Signal Pro-
cessing, Faculty of Computer Science and Information Technology, University
of Malaya, 50603 Kuala Lumpur, Malaysia (e-mail: willtwr@siswa.um.edu.my;
cs.chan@um.edu.my).
P. Yogarajah and J. Condell are with the School of Computing and Intelli-
gent Systems, University of Ulster (Magee), Northern Ireland, BT48 7JL, U.K.
(e-mail: p.yogarajah@ulster.ac.uk; J.Condell@ulster.ac.uk).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TII.2011.2172451
that fall within these predefined range(s) are selected as skin
pixels. In this approach, for any given color space, skin color
occupies a portion of the space, which may form a compact or
a large region. Other approaches include multilayer perceptron [10]–[12], Bayesian classifiers [13]–[15], and random
forest [16]. Although these single-feature solutions have been
successfully applied to human skin detection,
they still suffer from the following drawbacks. 1) Low accuracy: False skin
detection is a common problem when there is a wide variety of
skin colors across different ethnicities, complex backgrounds, and
strong illumination in images. 2) Luminance-invariant space:
Some robustness may be achieved via the use of luminance
invariant color spaces [1], [17]; however, such approaches can
withstand only the changes that the skin-color distribution undergoes
within a narrow set of conditions, and they also degrade performance [18]. 3) Large training sample required: In order to
define threshold value(s) for detecting human skin, most of the
state-of-the-art work requires a training stage. One must under-
stand that there are tradeoffs between the size of the training set
and classifier performance. For example, Jones and Rehg [15]
required 2 billion pixels collected from 18 696 web images to
achieve optimal performance. In this paper, we propose a novel
approach, a fusion framework, that applies the product rule to two
features, a smoothed 2-D histogram and a Gaussian model,
to perform automatic skin detection. First, we employ
an online dynamic approach as in [19] to calculate the skin
threshold value(s). Therefore, our proposed method does not
require any training stage beforehand. Second, a 2-D histogram
with smoothed densities and a Gaussian model are used to
model the skin and nonskin distributions, respectively. Finally,
a fusion strategy framework using the product of two features
is employed to perform automatic skin detection. To the best of
our knowledge, this is the first attempt that employs a fusion
strategy to detect skin in color images. Representing image
pixels in a suitable color space is the primary step in
skin segmentation. A comprehensive survey of different
color spaces (e.g., RGB, YCbCr, HSV, CIE Lab, CIE Luv, and
normalized RGB) for skin-color representation and skin-pixel
segmentation methods is given by Kakumanu et al. [20].
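As a rough illustration of the product-rule fusion idea (not the authors' exact formulation), the sketch below scores each pixel's chromaticity under two independent models, a smoothed 2-D histogram and a Gaussian, and multiplies the two scores so that a pixel is classified as skin only when both models agree. The bin count, the 3x3 box smoothing kernel, and the coordinate ranges are illustrative assumptions:

```python
import numpy as np

def smoothed_hist2d(samples, bins=32, rng=((0, 1), (0, 1))):
    """2-D histogram of chromaticity samples, smoothed with a 3x3 box filter."""
    h, _, _ = np.histogram2d(samples[:, 0], samples[:, 1],
                             bins=bins, range=rng, density=True)
    k = np.ones((3, 3)) / 9.0                       # box kernel (illustrative choice)
    pad = np.pad(h, 1, mode="edge")                 # pad so the blur keeps the shape
    return sum(pad[i:i + bins, j:j + bins] * k[i, j]
               for i in range(3) for j in range(3))

def hist_lookup(hist, pts, bins=32, rng=((0, 1), (0, 1))):
    """Read back the smoothed-histogram score for each 2-D point."""
    idx = []
    for d in range(2):
        lo, hi = rng[d]
        i = ((pts[:, d] - lo) / (hi - lo) * bins).astype(int)
        idx.append(np.clip(i, 0, bins - 1))
    return hist[idx[0], idx[1]]

def gaussian_score(x, mean, cov):
    """2-D multivariate Gaussian density, the second feature."""
    d = x - mean
    expo = -0.5 * np.einsum("ni,ij,nj->n", d, np.linalg.inv(cov), d)
    return np.exp(expo) / (2 * np.pi * np.sqrt(np.linalg.det(cov)))

def fuse(p_hist, p_gauss):
    """Product rule: both models must assign a high score."""
    return p_hist * p_gauss

# Toy usage: synthetic "skin" chromaticities clustered near (0.5, 0.4).
gen = np.random.default_rng(0)
skin = np.clip(gen.normal([0.5, 0.4], 0.05, size=(2000, 2)), 0, 1)
hist = smoothed_hist2d(skin)
mean, cov = skin.mean(axis=0), np.cov(skin.T)

pts = np.array([[0.5, 0.4],    # inside the skin cluster
                [0.1, 0.9]])   # far from it
scores = fuse(hist_lookup(hist, pts), gaussian_score(pts, mean, cov))
```

Here the fused score for the in-cluster point dominates the outlier's, which is the behavior the product rule is meant to enforce.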
In our approach, we do not employ a luminance-invariant
space. Instead, we choose the log opponent chromaticity (LO)
space [21]. The reasons are twofold: first, color opponency
is perceptually relevant as it has been proven that the human
visual system uses an opponent color encoding [22], [23]; and
second, in the LO color space, the use of logarithms reduces
an illumination change to a simple translation of coordinates.
Most of the aforementioned solutions claimed that illumination
variation is one of the contributing factors that degrade the
performance of skin detection systems. However, our empirical