Knowledge and Information Systems
https://doi.org/10.1007/s10115-018-1304-9
REGULAR PAPER
F2ConText: how to extract holistic contexts of persons of
interest for enhancing exploratory analysis
Md Abdul Kader¹ · Arnold Priguna Boedihardjo² · Mahmud Shahriar Hossain³
Received: 14 February 2017 / Revised: 4 November 2018 / Accepted: 24 November 2018
© Springer-Verlag London Ltd., part of Springer Nature 2018
Abstract
A wide variety of publicly available heterogeneous data has provided us with an opportunity
to meander through contextual snippets relevant to a particular event or persons of interest.
One example of a heterogeneous source is online news articles where both images and text
descriptions may co-exist in documents. Many of the images in a news article may contain
faces of people. The names of many of these people may not appear in the text. An expert on the topic
may be able to identify people in images or at least recognize the context of faces of people who are
not widely known. However, it is difficult as well as expensive to employ topic experts
to label every face in a massive news archive. In this paper, we describe an approach
named F2ConText that helps analysts build contextual information, e.g., named entity context
and geographical context, for facial images found within news articles. Our approach extracts
facial features of the faces detected in the images of publicly available news articles and
learns probabilistic mappings between the features and the contents of the articles in an
unsupervised manner. Afterward, it translates the mappings to geographical distributions
and generates a contextual template for every face detected in the collection. This paper
demonstrates three empirical studies—related to construction of context-based genealogy of
events, tracking of a contextual phenomenon over time, and creation of contextual clusters
of faces—to evaluate the effectiveness of the generated contexts.
Keywords Exploratory analysis · Image-text alignment · Geographical context ·
Information genealogy
✉ Md Abdul Kader
md.abdul.kader@ibm.com
Arnold Priguna Boedihardjo
Arnold.boedihardjo@radiantsolutions.com
Mahmud Shahriar Hossain
mhossain@utep.edu
¹ IBM Innovation Center, Austin, TX 78758, USA
² Radiant Solutions, Herndon, VA 20171, USA
³ The University of Texas at El Paso, El Paso, TX 79968, USA