JULY/AUGUST 2008 1541-1672/08/$25.00 © 2008 IEEE 51
Published by the IEEE Computer Society
Computational Cultural Dynamics
CONVEX: Similarity-Based Algorithms for Forecasting Group Behavior
Vanina Martinez, Gerardo I. Simari, Amy Sliva, and V.S. Subrahmanian,
University of Maryland College Park
Two classes of algorithms based on vector similarity predict a group’s behavior. These algorithms are extremely fast and highly accurate.
Many applications could benefit from accurately predicting an entity’s behavior. For example, researchers have developed methods to predict a terrorist organization’s probable actions (such as bombings or kidnappings) [1, 2]. Likewise, we might be interested in predicting whether a government will raise taxes of various types (property taxes, income taxes, and so on). Most of these predictions are based on indicators correlated with the actions being predicted.
However, in many cases we don’t know these indicators a priori. For example, consider the Minorities at Risk Organizational Behavior (MAROB) research project [3]. The MAROB data set consists of information about ethnopolitical groups at risk of falling into terrorism or already engaged in it. MAROB has identified about 284 variables to be tracked on a yearly basis for each of more than 280 ethnopolitical groups worldwide. A handful of these variables represent actions the group has taken; the others represent the environment or context in which the group functions. The resulting ontology is quite shallow in contrast to the deep ontologies in Semantic Web approaches such as RDF or OWL.
We’re interested in using data on past group behavior, particularly data related to the context and situation, to predict the group’s actions. To make such predictions, we developed a computational theory and the CONVEX families of algorithms. Compared to previous prediction approaches (see the “Related Work in Predicting Group Behavior” sidebar for some examples), these vector-based algorithms are extremely fast. They’re also accurate, and their results are easily explainable.
A formal vector model of group behavior
Suppose we have data covering a group’s behavior over many years. Each past behavior is a pair of vectors. The context vector contains the values of the context variables associated with the group, including information about actions taken by other organizations that affect the group. The action vector contains the values of the action variables associated with that group, reflecting actions the group has engaged in.
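As a minimal sketch of this representation, a past behavior can be stored as a pair of numeric tuples. The class name `Behavior`, the field names, and the sample values are illustrative assumptions, not part of the MAROB data set or the article's notation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Behavior:
    """One observation (for example, one year) of a group's behavior."""
    context: Tuple[float, ...]  # values of the context variables
    action: Tuple[float, ...]   # values of the action variables


# Hypothetical history for one group: two context variables and
# two action variables per observation (made-up values).
history: List[Behavior] = [
    Behavior(context=(1.0, 0.0), action=(0.0, 1.0)),
    Behavior(context=(0.0, 2.0), action=(1.0, 0.0)),
]
```

Keeping the two vectors in separate fields mirrors the distinction the text draws between what the group's environment looked like and what the group did.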
Suppose that a user wants to identify what the group might do in a current or hypothetical situation. In either case, a query vector describes the context or environment in which the group is, or is hypothesized to be, functioning. We’re interested in what the associated action vector will be. For example, it might tell us that the group will resort to high-intensity bombings but won’t resort to kidnappings.
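To make the query idea concrete, the following sketch returns the action vector of the past behavior whose context vector is closest to the query vector. It is an illustration only: the use of Euclidean distance, the single-nearest-neighbor rule, and the names `euclidean` and `predict_action` are assumptions made here, and the CONVEX algorithms themselves may differ.

```python
import math
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def predict_action(history: List[Tuple[Vector, Vector]], query: Vector) -> Vector:
    """Return the action vector paired with the context nearest to `query`.

    `history` is a list of (context vector, action vector) pairs.
    """
    if not history:
        raise ValueError("history must contain at least one past behavior")
    _, action = min(history, key=lambda pair: euclidean(pair[0], query))
    return action


# Hypothetical data: two past behaviors, each with two context
# variables and two action variables (made-up values).
past = [
    ((1.0, 0.0), (0.0, 1.0)),
    ((0.0, 2.0), (1.0, 0.0)),
]
predicted = predict_action(past, (0.9, 0.1))
```

Here the query `(0.9, 0.1)` lies much closer to the first context vector than to the second, so `predicted` is the first behavior's action vector, `(0.0, 1.0)`.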
More formally, we assume the existence of some arbitrary universe A whose elements are attributes. Each attribute A_i has an associated domain