IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 40, NO. 3, JUNE 2010 623
The Environment Value of an Opponent Model
Brett J. Borghetti
Abstract—We develop an upper bound for the potential perfor-
mance improvement of an agent using a best response to a model of
an opponent instead of an uninformed game-theoretic equilibrium
strategy. We show that the bound is a function of only the domain
structure of an adversarial environment and does not depend
on the actual actors in the environment. This bounds-finding
technique will enable system designers to determine if and what
type of opponent models would be profitable in a given adversarial
environment. It also gives them a baseline value with which to com-
pare performance of instantiated opponent models. We study this
method in two domains: selecting intelligence collection priorities
for convoy defense and determining the value of predicting enemy
decisions in a simplified war game.
Index Terms—Equilibrium, game theory, multiagent systems,
opponent model.
I. INTRODUCTION
SUPPOSE that we are planning to send a convoy through
dangerous territory in a hostile area. There is a chance that
we will encounter an ambush. We would like to minimize our
risk while ensuring that the cargo is delivered. We have two
roads to choose from—road A, which is the straightest shortest
distance to the destination, and road B, which is significantly
longer but still leads to the destination. Because of the time
sensitivity of the cargo, the travel time to the destination is im-
portant. Based on our estimates, the enemy may have planned
up to two ambushes, but we are not sure whether they have
planned any on road A, road B, or both. We also know based
on a probabilistic analysis of past events that each ambush
succeeds with probability p.
If we had access to intelligence sources that could reveal
more information about the enemy’s activities, we would be
able to plan our convoy better. Fortunately, we do have access
to two potential intelligence sources: capability X, which we
could use to determine the number of teams the enemy has
that it could use for ambushes (0, 1, or 2), and capability Y ,
which could persistently observe road A and report if and when
the enemy sets up an ambush on that road (road B is far too
long to get complete information on). Each of these capabilities
is fairly accurate (although not perfect). Unfortunately, due to
high demand, we have only been authorized to use one of
these intelligence capabilities. Our goal is to choose which
intelligence capability (X or Y ) we want to use. This paper
Manuscript received December 21, 2008; revised July 29, 2009. First pub-
lished November 3, 2009; current version published June 16, 2010. The views
expressed in this paper are those of the author and do not reflect the official
policy or position of the U.S. Air Force, the Department of Defense, or the U.S.
Government. This paper was recommended by Associate Editor T. Vasilakos.
The author is with the Department of Electrical and Computer Engineer-
ing, Air Force Institute of Technology, Dayton, OH 45433 USA (e-mail:
brett.borghetti@afit.edu).
Digital Object Identifier 10.1109/TSMCB.2009.2033703
describes a computational method for determining answers to
such questions.
Borrowing language from intelligent agents, we can define
the enemy as an agent, the number of ambush teams that they
have available as their state, and the locations (road A or B) that
they plan to ambush as their action. We define an agent’s policy
as a function that maps its states to actions
P : S → A
where S is the set of all possible states, and A is the set of all
possible actions.
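As a concrete illustration, a deterministic policy P : S → A can be sketched as a lookup table. The states, actions, and assignments below are illustrative assumptions drawn from the convoy scenario, not definitions from the paper:

```python
# Sketch of a deterministic policy P : S -> A as a lookup table.
# The enemy's state is the number of ambush teams it has available;
# its action is which road(s) it plans to ambush (illustrative only).

policy = {
    0: "none",   # no teams  -> no ambush
    1: "A",      # one team  -> ambush road A
    2: "A+B",    # two teams -> ambush both roads
}

def act(state):
    """Return the action this policy maps the given state to."""
    return policy[state]

print(act(2))  # -> A+B
```

Any function from states to actions works here; the table form simply makes the mapping P : S → A explicit.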
An agent model is a function that predicts something about
the agent. One class of models predicts the likelihood of each
action that the agent might take. The model takes a subset
of the features of a state as input and provides a probability
distribution over possible actions, i.e.,
M : s → w_a

where a is the set of possible actions in the state, and w_a is a
probability distribution over the n actions such that

∑_{i=1}^{n} w_i = 1.
Another type of model predicts the likelihood of the agent
being in each state that it could be in, i.e.,

M : s → w_s

where w_s is a probability distribution over the m states, and

∑_{i=1}^{m} w_i = 1.
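A minimal sketch of both model types, with toy probabilities assumed purely for illustration; the essential property is that each model outputs a distribution whose weights sum to 1:

```python
# Sketch of the two opponent-model types (toy numbers, assumed):
# an action model returning a distribution w over possible actions,
# and a state model returning a distribution w over possible states.

def action_model(state_features):
    """Predict a distribution over the enemy's possible actions."""
    return {"ambush A": 0.5, "ambush B": 0.3, "no ambush": 0.2}

def state_model(state_features):
    """Predict a distribution over the enemy's possible states
    (number of ambush teams available: 0, 1, or 2)."""
    return {0: 0.2, 1: 0.5, 2: 0.3}

# Both outputs are valid probability distributions: weights sum to 1.
for dist in (action_model(None), state_model(None)):
    assert abs(sum(dist.values()) - 1.0) < 1e-9
```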
Since intelligence capabilities X and Y both have the potential
to reveal information about the enemy’s activities, the
information they provide can be treated as if it came from opponent
models. Since intelligence capability X provides (a probability
distribution over) the number of ambush teams that the enemy
has, it predicts the enemy’s state, i.e.,
X : s → w_s.
Y is equivalent to a model that predicts the (probability
distribution over) likelihood of an ambush team residing on
road A; it (partially) predicts the enemy’s action, i.e.,
Y : s → w_a.
In agent-based literature, the active entities are the agents,
and the world in which they perceive and take action is some-
times called the environment. Mathematically, the environment
is a function that maps states (of the world) and actions (of the
agents) to successor states, i.e.,
Env : S_t × A_t → S_{t+1}.
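A nondeterministic environment can be sketched by sampling the successor state. The state encoding, success probability p, and outcome labels below are assumptions taken loosely from the convoy scenario, not the paper’s formal model:

```python
import random

# Sketch of a nondeterministic environment Env : (state, action) -> state'.
# `ambushes` is the number of ambushes the enemy placed on the road the
# convoy chose; each ambush succeeds independently with probability p
# (an assumed parameter, echoing the scenario in the Introduction).

def env_step(ambushes, road, p=0.4, rng=random):
    """Map the current state and the convoy's action to a successor state."""
    for _ in range(ambushes):
        if rng.random() < p:
            return "intercepted"   # some ambush on this road succeeded
    return "delivered"             # convoy reached the destination

print(env_step(0, "A"))  # -> delivered (no ambushes, so no risk)
```

Calling env_step twice with the same arguments can yield different successor states, which is exactly the nondeterminism the text describes.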
Note that environments are often nondeterministic: The
mapping from states and actions to successor states could be
1083-4419/$26.00 © 2009 IEEE