Improved Semantic-based Human Interaction
Understanding Using Context-based knowledge
Kamrad Khoshhal Roudposhti*, Jorge Dias*,**
* Institute of Systems and Robotics, Department of Electrical and Computer Engineering, University of Coimbra,
Portugal.
** Khalifa University, UAE
{kamrad,jorge}@isr.uc.pt
Abstract—This paper proposes a descriptive approach for
context-based human activity analysis through a hierarchical
framework in a scene understanding application. Each human
movement, with respect to the person, to others and to the
scene, gives rise to different layers of human activity analysis,
which are usually investigated separately depending on the
application. Human behaviour cannot be analysed properly
unless all of these layers of information are considered. This
study presents the effect of using the different layers of
information on the accuracy of the analysis. The contributions
are: using different information layers, such as human body
part movements and human-object interaction in 3D space, to
improve human activity analysis; and proposing a probabilistic
and descriptive model based on a well-known human movement
descriptor and a Bayesian Network (BN) approach. Based on
this framework, the model is generalizable and flexible, which
are necessary properties for a practically applicable system.
The capability of the proposed approach is presented in the
experiments section.
Index Terms—Scene understanding, hierarchical framework,
human interaction analysis, Bayesian approach, human movement
analysis, descriptive model.
I. INTRODUCTION
This paper proposes a flexible scene understanding model,
which describes human activity based on a well-known
descriptor and deals with uncertainty using probabilistic models.
Human activity analysis can be categorized as context-free or
context-based. In context-free approaches the model is
independent of scene parameters and relies only on features
belonging to the person. In reality, however, context-based
features play a very important role in analysing human activities.
For instance, when a person is approaching a chair, we infer
that the person is probably going to sit on the chair, not
to sleep.
As Delaitre et al. described in [6], since object detection is a
widely studied topic in computer vision, analysing the relation
between human movements and the surrounding objects can
produce valuable information about daily human activities. For
instance, people have learned the most probable (normal)
activities of a person who is reaching for a chair; thus people
hold a set of activity probabilities that depends on the objects
in the scene.
The problem is what level of human movement information
might be useful, and how a general framework can be defined
for analysing any possible human-object interaction. For this
purpose, information from the low level, such as body part
motions, up to higher levels, such as human interactions, can
be useful. Dealing with these different kinds of information
results in a complex model. Thus, a hierarchical framework
was used to reduce the complexity of the model [1] and to
provide different levels of human activity analysis [11].
The probability distributions relating human motions to
human-object based information can be modelled, given the
possible activities and the objects of interest in a scene. The
Laban Movement Analysis (LMA) system, which consists of
several components, is used to define suitable variables for
human motions (Effort, Shape) [13], [12] and human-scene
relations (Relationship) [16], [10]. Gupta et al. [9] tackled
the problem based on 2D images. They therefore focused
more on the computer vision problems of these applications,
and used only the person's hand trajectory information to
analyse human-object interactions (reaching and
manipulation). Their Bayesian model cannot easily be
extended to other activities. We therefore propose the
hierarchical model to deal with this problem, and, to avoid
the limitations of 2D-based analysis, we used a motion tracker
suit (MVN®) with several inertial sensors attached to different
body parts to obtain the 3D poses of human body parts at up
to 120 frames per second. There are several works using
3D-based human movement analysis with high accuracy [14],
[4], and also in 3D virtual applications [7], but they focused
only on classifying simple human movements.
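As a rough illustration of how motion variables and human-object relation variables can jointly inform an activity estimate, the following minimal sketch applies Bayes' rule to a flat naive-Bayes model. It is not the hierarchical BN proposed in this paper: the activity names, the variable states (one Effort-like and one Relationship-like variable) and all probability values are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a flat naive-Bayes model, not the paper's
# hierarchical Bayesian Network. All activity names, variable states and
# probability values below are hypothetical.

# Prior belief over activities, P(activity).
PRIOR = {"sit": 0.4, "lie_down": 0.2, "walk_past": 0.4}

# Conditional probability tables:
# CPT[variable][activity][state] = P(state | activity).
CPT = {
    # Relationship-like variable: how the person relates to an object.
    "relationship": {
        "sit": {"approaching_chair": 0.8, "no_object": 0.2},
        "lie_down": {"approaching_chair": 0.1, "no_object": 0.9},
        "walk_past": {"approaching_chair": 0.3, "no_object": 0.7},
    },
    # Effort-like variable: a coarse quality of the motion over time.
    "effort_time": {
        "sit": {"sustained": 0.7, "sudden": 0.3},
        "lie_down": {"sustained": 0.8, "sudden": 0.2},
        "walk_past": {"sustained": 0.4, "sudden": 0.6},
    },
}


def posterior(evidence):
    """Return P(activity | evidence) for a dict {variable: observed_state}.

    Observed variables are assumed conditionally independent given the
    activity (the naive-Bayes assumption).
    """
    scores = {}
    for activity, prior in PRIOR.items():
        score = prior
        for variable, state in evidence.items():
            score *= CPT[variable][activity][state]
        scores[activity] = score
    total = sum(scores.values())
    return {activity: s / total for activity, s in scores.items()}


# Belief from motion evidence alone, then with scene context added.
motion_only = posterior({"effort_time": "sustained"})
with_context = posterior(
    {"effort_time": "sustained", "relationship": "approaching_chair"}
)
print(motion_only)
print(with_context)
```

With these made-up numbers, the belief in "sit" rises from about 0.47 with motion evidence alone to about 0.78 once the relation to the chair is observed, which mirrors the chair example in the introduction.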
This paper is organized as follows: Sec. II presents
the feature extraction methods, and, based on them, the
hierarchy-based human activity modelling is presented in Sec.
III. Experimental results are presented and discussed in Sec. IV,
and Sec. V closes the paper with a conclusion and an outlook
on future work.
II. FEATURE CATEGORIZATION AND EXTRACTION USING
LMA
Body part trajectories during human activities and the
relationship between the human and the objects of interest in
the scene are the input data of this study. A motion tracker
suit is used to obtain the 3D human body parts positions
2013 IEEE International Conference on Systems, Man, and Cybernetics
978-1-4799-0652-9/13 $31.00 © 2013 IEEE
DOI 10.1109/SMC.2013.494