JULY/AUGUST 2008, 1541-1672/08/$25.00 © 2008 IEEE. Published by the IEEE Computer Society.

Computational Cultural Dynamics

CONVEX: Similarity-Based Algorithms for Forecasting Group Behavior

Vanina Martinez, Gerardo I. Simari, Amy Sliva, and V.S. Subrahmanian, University of Maryland College Park

Two classes of algorithms based on vector similarity predict a group’s behavior. These algorithms are extremely fast and highly accurate.

Many applications could benefit from accurately predicting an entity’s behavior. For example, researchers have developed methods to predict a terrorist organization’s probable actions (such as bombings or kidnappings) [1,2]. Likewise, we might be interested in predicting whether a government will raise taxes of various types (property taxes, income taxes, and so on). Most of these predictions are based on indicators correlated with the actions being predicted. However, in many cases we don’t know these indicators a priori.

For example, consider the Minorities at Risk Organizational Behavior (MAROB) research project [3]. The MAROB data set consists of information about ethnopolitical groups at risk of falling into terrorism or already engaged in it. MAROB has identified about 284 variables to be tracked on a yearly basis for each of more than 280 ethnopolitical groups worldwide. A handful of these variables represent actions the group has taken; the others represent the environment or context in which the group functions. The ontology generated is quite shallow in contrast to the deep ontologies in Semantic Web approaches such as RDF or OWL.

We’re interested in using data on past group behavior, particularly data related to the context and situation, to predict the group’s actions. To make such predictions, we developed a computational theory and the CONVEX families of algorithms.
Compared to previous prediction approaches (see the “Related Work in Predicting Group Behavior” sidebar for some examples), these vector-based algorithms are extremely fast. They’re also accurate, and their results are easily explainable.

A formal vector model of group behavior

Suppose we have data covering a group’s behavior over many years. Each past behavior is a pair of two vectors. The context vector contains the values of the context variables associated with the group, including information about actions taken by other organizations that affect the group. The action vector contains the values of the action variables associated with that group, reflecting actions the group has engaged in.

Suppose that a user wants to identify what the group might do in a current or hypothetical situation. In either case, a query vector describes the context or environment in which the group is, or is hypothesized to be, functioning. We’re interested in what the associated action vector will be. For example, it might tell us that the group will resort to high-intensity bombings but won’t resort to kidnappings.

More formally, we assume the existence of some arbitrary universe A whose elements are attributes. Each attribute A_i has an associated domain