Multimed Tools Appl
DOI 10.1007/s11042-011-0936-5
An ontology-based evidential framework for video
indexing using high-level multimodal fusion
Rachid Benmokhtar · Benoit Huet
© Springer Science+Business Media, LLC 2011
Abstract This paper deals with information retrieval and semantic indexing of
multimedia documents. We propose a generic scheme combining an ontology-
based evidential framework and high-level multimodal fusion, aimed at recognising
semantic concepts in videos. The work is organised in two stages. First, evidence
theory is adapted to neural networks, yielding the Neural Network based on
Evidence Theory (NNET). Compared with probabilistic methods, evidence theory
provides two additional pieces of information for decision-making: the degree of
belief and the system ignorance. The NNET is then improved further by incorporating the relationship
between descriptors and concepts, modeled by a weight vector based on entropy
and perplexity. Combining this vector with the classifier outputs gives a
new model called the Perplexity-based Evidential Neural Network (PENN). Second,
an ontology is introduced to model the influence of the relations between
concepts and to readjust the confidence values accordingly.
To represent these relations, three types of information are computed:
low-level visual descriptors, concept co-occurrence and semantic similarities. The
final system is called Ontological-PENN. A comparison of the main similarity
construction methodologies is also provided. Experimental results on the TRECVid
dataset support the effectiveness of our scheme.
Keywords Video shots indexing · Semantic gap · Classification · Classifier fusion ·
Inter-concepts similarity · Ontology · LSCOM-lite · TRECVid
R. Benmokhtar (✉) · B. Huet
Département Communications Multimédia, Eurécom, 2229, route des crêtes,
06904 Sophia-Antipolis, France
e-mail: rachid.benmokhtar@eurecom.fr
B. Huet
e-mail: benoit.huet@eurecom.fr