Learning equilibria in constrained Nash-Cournot games with
misspecified demand functions
Hao Jiang, Uday V. Shanbhag and Sean P. Meyn
Abstract— We consider a constrained Nash-Cournot
oligopoly where the demand function is linear. While cost
functions and capacities are public information, firms only
have partial information regarding the demand function.
Specifically, firms know either the intercept or the slope of
the demand function and cannot observe aggregate output.
We consider a learning process in which firms update their
profit-maximizing quantities and their beliefs regarding the
unknown demand function parameters, based on disparities
between observed and estimated prices. A characterization of
the mappings, corresponding to the fixed point of the learning
process, is provided. This result paves the way for developing
a Tikhonov regularization scheme that is shown to learn the
correct equilibrium, in spite of the multiplicity of equilibria.
Despite the absence of monotonicity of the gradient maps, we
prove the convergence of constant and diminishing steplength
distributed gradient schemes under a suitable caveat on the
starting points. Notably, precise rate of convergence estimates
are provided for the constant steplength schemes.
I. INTRODUCTION
The Nash solution concept [7] has been extensively analyzed
and applied in economics, engineering and applied
sciences and finds relevance in the examination of strategic
behavior in noncooperative games. In such settings, the Nash
equilibrium is a tuple of strategies from which no player can
profit from unilaterally deviating. In this paper, we consider
a deterministic Nash-Cournot game, in which a common
homogeneous commodity is being produced by several firms
and its price is specified completely by a function of the
aggregate output. In such a game, the ith player solves
$\mathrm{Opt}(x_{-i})$, defined as
\[
\min \; f_i(x;\theta) \triangleq c_i(x_i) - p(X;\theta)\, x_i \quad \text{subject to} \quad x_i \in K_i,
\]
where $x \triangleq (x_1,\ldots,x_N)^T$, $x_i$ denotes the output of firm $i$,
$c_i(\cdot)$ denotes firm $i$'s cost function, and $K_i \triangleq [0,\mathrm{Cap}_i]$ with
$\mathrm{Cap}_i$ being the capacity of firm $i$. The price function of the
commodity, denoted by $p(X;\theta)$, is defined as
\[
p(X;\theta) \triangleq a^* - b^* X,
\]
where $X = \sum_{i=1}^N x_i$ and $\theta = (a^*, b^*)$. The associated Nash-Cournot
equilibrium is given by a tuple $x^* = (x_i^*)_{i=1}^N$ where
$x_i^* \in \mathrm{SOL}(\mathrm{Opt}(x^*_{-i}))$ for $i = 1,\ldots,N$, $\mathrm{SOL}(\mathrm{Opt}(x^*_{-i}))$
denotes the solution of $\mathrm{Opt}(x_{-i})$ and $x_{-i} = (x_j)_{j \neq i}$.
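As an illustration of the equilibrium definition above, the following minimal sketch computes a constrained Nash-Cournot equilibrium by a projected gradient iteration when the demand parameters are known, so no learning takes place. It rests on assumptions that are not part of the model statement: costs are taken to be linear, $c_i(x_i) = c_i x_i$, so that firm $i$'s gradient is $c_i - a + b(X + x_i)$; the function names, the default steplength $1/(b(N+1))$ and the iteration count are hypothetical choices made only for this example.

```python
def cournot_gradient_map(x, a, b, c):
    """Return F(x), where F_i(x) = c_i - a + b * (X + x_i).

    F_i is the derivative of f_i(x) = c_i * x_i - (a - b * X) * x_i with
    respect to x_i, assuming linear costs (an assumption of this sketch).
    """
    total = sum(x)
    return [c_i - a + b * (total + x_i) for x_i, c_i in zip(x, c)]


def project_onto_box(x, caps):
    """Project each output onto its capacity interval K_i = [0, Cap_i]."""
    return [min(max(x_i, 0.0), cap) for x_i, cap in zip(x, caps)]


def solve_cournot(a, b, c, caps, step=None, iters=5000):
    """Projected gradient iteration x <- Proj_K(x - step * F(x)).

    With linear costs, F is affine with symmetric positive definite
    Jacobian b * (I + 11^T), whose largest eigenvalue is b * (N + 1).
    The default step 1 / (b * (N + 1)) is a conservative choice.
    """
    n = len(c)
    if step is None:
        step = 1.0 / (b * (n + 1))
    x = [0.0] * n
    for _ in range(iters):
        grad = cournot_gradient_map(x, a, b, c)
        x = project_onto_box(
            [x_i - step * g for x_i, g in zip(x, grad)], caps
        )
    return x


if __name__ == "__main__":
    # Duopoly with a = 10, b = 1, marginal costs (1, 2).
    print(solve_cournot(10.0, 1.0, [1.0, 2.0], [10.0, 10.0]))  # slack capacities
    print(solve_cournot(10.0, 1.0, [1.0, 2.0], [2.0, 10.0]))   # firm 1 capped at 2
```

With slack capacities the iterates approach the interior equilibrium $(10/3, 7/3)$, which satisfies $F(x) = 0$. When firm 1's capacity is reduced to 2, its constraint binds and the equilibrium moves to $(2, 3)$, which is why a projection step is needed in the constrained setting.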
Jiang and Shanbhag are with the Department of Industrial and Enterprise
Systems Engineering while Meyn is with the Department of Electrical and
Computer Engineering, both at the University of Illinois, Urbana IL 61801.
They can be reached by email at {jiang23,udaybag,meyn}@illinois.edu. This
work has been supported by DOE award DE-SC0003879.
Cournot models predate the Nash solution concept and
a host of variants have been analyzed [8], [9]. An oft-
used assumption in game-theoretic models is one which
requires that player payoffs are public knowledge and every
player is able to forecast the choices of his adversaries.
As noted by Kirman [5], a firm’s information sets may be
incomplete as manifested by a regime where firms have
imperfect information of the payoffs of their adversaries. In a
Cournot setting, firms may have an incorrect specification of
the demand function. Naturally, firms can ascertain that their
estimates differ from observations, leading to an adjustment
process. In effect, firms learn the parameters of the game
while participating in the game.
Our work is inspired by a series of papers by Szidarovszky,
Bischi and their coauthors [2], [3], [10] where firms competing
in a Nash-Cournot game attempt to learn a parameter of
the demand function while playing the game. In [1], in an
unconstrained regime with linear costs, the authors examine
the stability of learning the equilibrium and one of the
unknown parameters of $\theta$ (either $a^*$ or $b^*$). In particular,
they consider two cases:
(Case 1): The slope $b^*$ is known, but $a^*$ is unknown.
(Case 2): $a^*$ is known, but the slope $b^*$ is unknown.
It is shown that this process is globally stable in Case 1 and
unstable in Case 2.
In this paper, we consider the learning of equilibria when
one component of θ is unknown and the aggregate output X
is unobservable by the firms. In particular, if $b^*$ and $X$ are
unknown, then our goal lies in developing algorithms that
construct a sequence $z_k = (x_k, b_k)$ such that
\[
\lim_{k \to \infty} z_k = z^*,
\]
where $z^* = (x^*, b^*)$.
Broadly speaking, our focus is on constrained Nash-Cournot
problems; such an extension is not a trivial one in
that gradient-based learning now requires the use
of a projection operator. In such a regime, we prove that
the mappings associated with the variational problems are
$P$ and $P_0$ maps for Cases 1 and 2, respectively. Notably,
while such a variational problem has a unique solution in the
context of Case 1, such uniqueness cannot be claimed
when learning $b^*$ (Case 2). Despite this lack of uniqueness,
we develop a Tikhonov regularization scheme that is
guaranteed to converge to the correct equilibrium, under
suitable conditions. The convergence of standard gradient-based
distributed schemes cannot be immediately claimed
since the mappings are not monotone (but admit a weaker
2011 50th IEEE Conference on Decision and Control and
European Control Conference (CDC-ECC)
Orlando, FL, USA, December 12-15, 2011
978-1-61284-799-3/11/$26.00 ©2011 IEEE