ELSEVIER
European Journal of Operational Research 88 (1996) 622-636

Theory and Methodology

A K-step look-ahead analysis of Value Iteration algorithms for Markov decision processes

Meir Herzberg a,*, Uri Yechiali b

a Telecom Australia Research Laboratories, 770 Blackburn Rd., Clayton, Vic. 3168, Australia
b Department of Statistics and Operations Research

Received April 1993; revised June 1994

Abstract

We introduce and analyze a general look-ahead approach for Value Iteration Algorithms used in solving both discounted and undiscounted Markov decision processes. This approach, based on the value-oriented concept interwoven with multiple adaptive relaxation factors, leads to accelerating procedures which perform better than the separate use of either the value-oriented concept or relaxation. Evaluation and computational considerations of this method are discussed, practical guidelines for implementation are suggested, and the suitability of enhancing the method by incorporating Phase 0, Action Elimination procedures and Parallel Processing is indicated. The method was successfully applied to several real problems. We present some numerical results which support the superiority of the developed approach, particularly for undiscounted cases, over other Value Iteration variants.

Keywords: Markov processes; Value iteration; Modified policy iteration; Adaptive relaxation factor; Look-ahead analysis

1. Introduction

The successive substitution technique appears to be the best computational method for solving large Markov decision models, since it avoids both handling huge Linear Programming models and repeatedly solving large sets of linear equations (see Tijms [23]). The classical way of using this technique is the standard Value Iteration Algorithm (VIA), applied to both discounted and undiscounted Markov Decision Processes (MDPs).
For discounted cases it relies on a basic recursive equation of the form

$$V_n(i) = \max_{a \in A(i)} \Big\{ r_i^a + \beta \sum_{j \in I} P_{ij}^a \, V_{n-1}(j) \Big\}, \quad i \in I \tag{1}$$

* Corresponding author.

0377-2217/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved
SSDI 0377-2217(94)00208-8