Computer Methods and Programs in Biomedicine 82 (2006) 73–80
journal homepage: www.intl.elsevierhealth.com/journals/cmpb
A neural approach to extract foreground from human movement images
S. Conforto∗, M. Schmid, A. Neri, T. D’Alessio
Dept. of Applied Electronics, University Roma TRE, Via della Vasca Navale 84, I-00146 Roma, Italy
article info
Article history:
Received 22 October 2004
Received in revised form 9 February 2006
Accepted 10 February 2006
Keywords:
Segmentation
Human movement analysis
Quality assessment
Neural networks
abstract
In recent years many approaches to foreground extraction from images of human
movement have been presented. Foreground extraction is a pre-processing procedure
implemented in a human movement capture system to facilitate
the tracking of anatomical landmarks on the human body. In this work, an approach based
on an unsupervised neural network has been studied: a Kohonen map has been designed to
recognize and separate structures characterizing foreground and background. The proposed
technique is fully automatic, and its performance has been compared with that of two
further approaches based on differences between foreground and background images. To
quantify the segmentation quality, an already validated, objective, and automatic
criterion has been used. The results obtained are adequate for the final aim of the
application and show the feasibility of the proposed approach.
© 2006 Elsevier Ireland Ltd. All rights reserved.
1. Introduction
The capture of human movement is a process dealing with the
large-scale movements of a subject at different resolutions (that
is, entire body, limbs, single fingers). This is a hot topic for several
applications, such as surveillance, control, and analysis
[1]: the first deals with monitoring movement in order to interpret
and classify the subjects’ actions; the second exploits the
results of the movement capture to implement control
functionalities; the third concerns the analysis of captured data, to
be used either for sport performance assessment or for
diagnostic and rehabilitation purposes.
From a functional point of view, a system for the capture of
movement can be subdivided into four logical blocks:
initialization, tracking, pose estimation, and interpretation. The first
covers all the procedures needed to correctly interpret
the scene in which the subject moves; the second is
responsible for the tracking procedure; the third uses the tracking
results to determine the pose of body segments over time; the
fourth, which is the final goal of the system, interprets the overall
action performed through the movement.
∗ Corresponding author. Fax: +39 06 55177026.
E-mail address: conforto@uniroma3.it (S. Conforto).
The efficiency of tracking can be improved by the use of
pre-processing techniques, such as those allowing the extraction of
regions of interest from video sequences, generally referred to
as image segmentation [2]. This low-level process deals with
the subdivision of a scene into regions of interest on the basis
of different coherence criteria such as color [3], texture [4],
edge [5], or any combination of these.
In the framework of human movement analysis, the
segmentation process often consists of separating the moving
subject (i.e., the foreground) from the background. The
foreground/background separation can be achieved by techniques
based on either temporal or spatial data.
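
As one illustration of separation based on temporal data, the sketch below differences a frame against a background-only reference frame and thresholds the result. It is a minimal sketch under stated assumptions, not the method proposed in this work: the function name `subtract_background`, the representation of grayscale images as nested lists, and the threshold value of 30 are all hypothetical.

```python
def subtract_background(frame, background, threshold=30):
    """Return a binary mask (1 = foreground, 0 = background).

    frame, background: equally sized 2-D lists of grayscale intensities.
    threshold: hypothetical cut-off on the absolute intensity difference.
    """
    if len(frame) != len(background):
        raise ValueError("frame and background must have the same height")
    mask = []
    for frame_row, bg_row in zip(frame, background):
        if len(frame_row) != len(bg_row):
            raise ValueError("frame and background must have the same width")
        # A pixel is marked as foreground when it differs enough
        # from the corresponding pixel of the background-only frame.
        mask.append([1 if abs(f - b) > threshold else 0
                     for f, b in zip(frame_row, bg_row)])
    return mask


# Toy 3x3 example: a bright object covers two pixels of the middle row.
background = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
frame = [[10, 12, 10], [10, 200, 180], [11, 10, 10]]
mask = subtract_background(frame, background)
print(mask)
```

Small intensity changes (here, differences of 1 or 2 grey levels) stay below the threshold and are treated as background, which is why the choice of threshold matters in practice.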
Temporal data can be used in two different ways, subtraction
and flow. Subtraction techniques perform the segmentation
by processing one or more inter-frame differences [6]; a
particularly simple situation occurs when in at least one frame
only the background scene is present [7,8]. The optical flow
0169-2607/$ – see front matter © 2006 Elsevier Ireland Ltd. All rights reserved.
doi:10.1016/j.cmpb.2006.02.005