Appl Math Optim (2010) 61: 167–190
DOI 10.1007/s00245-009-9080-2
The Discounted Method and Equivalence of Average
Criteria for Risk-Sensitive Markov Decision Processes
on Borel Spaces
Rolando Cavazos-Cadena · Francisco Salem-Silva
Published online: 30 June 2009
© Springer Science+Business Media, LLC 2009
Abstract This note concerns discrete-time controlled Markov chains with Borel state
and action spaces. Given a nonnegative cost function, the performance of a control
policy is measured by the superior limit risk-sensitive average criterion associated
with a constant and positive risk sensitivity coefficient. Within such a framework, the
discounted approach is used (a) to establish the existence of solutions for the corre-
sponding optimality inequality, and (b) to show that, under mild conditions on the
cost function, the optimal value functions corresponding to the superior and inferior
limit average criteria coincide on a certain subset of the state space. The approach of
the paper relies on standard dynamic programming ideas and on a simple analytical
derivation of a Tauberian relation.
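For reference, the superior limit risk-sensitive average criterion mentioned in the abstract is typically formulated as follows; this is the standard definition in the risk-sensitive MDP literature, and the notation (cost function C, state process {X_t}, action process {A_t}) may differ slightly from that used in the body of the paper:

```latex
% Risk-sensitive average cost of policy \pi starting at state x,
% for a constant risk sensitivity coefficient \lambda > 0.
% (Standard formulation; the paper's own notation may differ.)
J(\lambda, x, \pi)
  \;=\; \limsup_{n \to \infty} \frac{1}{\lambda n}
  \log E_x^{\pi}\!\left[
     \exp\!\left( \lambda \sum_{t=0}^{n-1} C(X_t, A_t) \right)
  \right]
```

The inferior limit criterion is obtained by replacing the limit superior with the limit inferior; the abstract's equivalence result concerns the optimal value functions of these two criteria.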
Keywords Hölder’s inequality · Contractive operators · Generalized Fatou’s
lemma · Risk-sensitive vanishing discount approach · Weak continuity
1 Introduction
This work concerns discrete-time Markov decision processes (MDPs) evolving on a
Borel space. The system is driven by a risk-averse decision maker with constant risk
Dedicated to Professor Onésimo Hernández-Lerma, on the occasion of his sixtieth birthday.
This work was supported by the PSF Organization under Grant No. 08-06(450) and in part by
CONACYT under Grant 25357.
R. Cavazos-Cadena (✉)
Departamento de Estadística y Cálculo, Universidad Autónoma Agraria Antonio Narro, Buenavista,
Saltillo, COAH 25315, Mexico
e-mail: rcavazos@uaaan.mx
F. Salem-Silva
Facultad de Matemáticas, Universidad Veracruzana, Circuito Gonzalo Aguirre Beltrán s/n, Zona
Universitaria, Xalapa, VER 91000, Mexico
e-mail: frsalem@uv.mx