Extrapolation of an Optimal Policy using Statistical Probabilistic Model Checking

Artur Rataj (1) and Bożena Woźna-Szcześniak (2)

(1) IITiS, Polish Academy of Sciences, ul. Bałtycka 5, 44-100 Gliwice, Poland, arturrataj@gmail.com
(2) IMCS, Jan Długosz University, Al. Armii Krajowej 13/15, 42-200 Częstochowa, Poland, b.wozna@ajd.czest.pl

Abstract. We show how to extrapolate an optimal policy controlling a model which is itself too large for the policy to be found directly using probabilistic model checking (PMC). In particular, we use PMC to look for a globally optimal resolution of non-determinism in several small Markov Decision Processes (MDPs). We then use the resolution to find a respective set of decision boundaries representing the optimal policies found. Next, a hypothesis is formed on an extrapolation of these boundaries to an equivalent boundary in a large MDP. The resulting hypothetical extrapolated decision boundary is then verified statistically, and thus approximately, to check whether it indeed represents an optimal policy for the large MDP. The verification either weakens or strengthens the hypothesis. The optimality criterion of the policy can be expressed in any modal logic that includes the probabilistic operator Pp[·] and for which a PMC method exists.

Keywords: probabilistic model checking, statistical model checking, non-determinism, optimal policy, extrapolation.

1 Introduction

Probabilistic model checking (PMC) [4] refers to a range of techniques for the formal analysis of a stochastic system, which is usually a state transition system with transitions labelled by probability values.

A policy of a decision maker (an agent) controlling a Markov Decision Process (MDP) resolves the non-deterministic choice which exists in each state of the MDP in the form of a number of probability distributions over states, of which one is arbitrarily chosen (for details see Sec. 2).
An optimal policy [12], with respect to a given property, may in particular correspond to either the minimum or the maximum value of the property. In this paper we consider an MDP with properties specified in any modal logic that includes the probabilistic operator Pp[·] and for which a PMC method exists. A common example of such a logic, for which efficient model checkers exist, is Probabilistic Computation Tree Logic (PCTL) [3].
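
To make the notion of a policy resolving non-determinism concrete, the following minimal sketch computes the minimum and maximum probability of reaching a target state in a toy MDP, together with a policy attaining each value. It is an illustration only and not the method of this paper: the MDP `TOY_MDP`, its state and action names, and the function `optimal_reachability` are all hypothetical, and plain value iteration with a simple stopping threshold stands in for what a probabilistic model checker would do.

```python
from typing import Dict, Tuple

# state -> action -> probability distribution over successor states
MDP = Dict[str, Dict[str, Dict[str, float]]]

# Hypothetical toy MDP (an assumption for illustration, not from the paper).
# States without actions ("goal", "fail") are treated as absorbing.
TOY_MDP: MDP = {
    "s0": {"a": {"goal": 0.5, "fail": 0.5}, "b": {"s1": 1.0}},
    "s1": {"a": {"goal": 0.9, "fail": 0.1}, "b": {"goal": 0.2, "fail": 0.8}},
    "goal": {},
    "fail": {},
}


def optimal_reachability(
    mdp: MDP,
    goal: str,
    maximise: bool = True,
    eps: float = 1e-12,
    max_iter: int = 10_000,
) -> Tuple[Dict[str, float], Dict[str, str]]:
    """Value iteration for the min or max probability of eventually reaching `goal`.

    Returns the value of every state and one memoryless policy
    (state -> action) that attains these values.
    """
    choose = max if maximise else min
    # Start from 0 everywhere except the goal, so the iteration approaches
    # the sought probabilities from below.
    value = {s: (1.0 if s == goal else 0.0) for s in mdp}

    for _ in range(max_iter):
        delta = 0.0
        for s, actions in mdp.items():
            if s == goal or not actions:
                continue  # absorbing states keep their value
            new = choose(
                sum(p * value[t] for t, p in dist.items())
                for dist in actions.values()
            )
            delta = max(delta, abs(new - value[s]))
            value[s] = new
        if delta < eps:  # simple stopping heuristic, not an error bound
            break

    # Resolve the non-determinism: pick the best action in each state.
    policy = {}
    for s, actions in mdp.items():
        if s != goal and actions:
            policy[s] = choose(
                actions,
                key=lambda a: sum(p * value[t] for t, p in actions[a].items()),
            )
    return value, policy


if __name__ == "__main__":
    for maximise in (True, False):
        value, policy = optimal_reachability(TOY_MDP, "goal", maximise)
        print("max" if maximise else "min", value["s0"], policy)
```

In this toy model the maximising policy reaches `goal` from `s0` with probability 0.9 and the minimising one with probability 0.2; the two policies differ only in the action chosen in `s1`. Exhaustive computations of this kind are feasible only for small state spaces, which is the limitation that motivates extrapolating a policy to a large MDP.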