Proceedings of the 1999 Winter Simulation Conference
P. A. Farrington, H. B. Nembhard, D. T. Sturrock, and G. W. Evans, eds.

MONKEYS, GAMBLING, AND RETURN TIMES: ASSESSING PSEUDORANDOMNESS

Stefan Wegenkittl
Department of Mathematics
University of Salzburg
A 5026 Salzburg, AUSTRIA

ABSTRACT

We present a general construction kit for empirical tests of pseudorandom number generators which comprises a wide range of well-known standard tests. Within our setup we identify two important families of tests and check for connections between them. This leads us to query the existence of universal tests which claim to be able to detect any possible defect of a generator.

1 INTRODUCTION

Whereas the art of constructing pseudorandom number generators (PRNGs, see Knuth 1997, L’Ecuyer 1994, and Hellekalek 1998a for overviews) is that of carefully hiding the deterministic nature of numbers that are afterwards presumed random, the art of empirical testing is to find the hidden correlations and to analyze their impact on simulation studies and Monte Carlo algorithms. Following Marsaglia and Zaman (1993), a good PRNG produces an output which does not differ significantly from that of a (memoryless and fair) monkey hitting keys on a numeric keyboard.

Theoretical tests (Hellekalek 1998b, Niederreiter 1992, 1995), which are often provided by the authors of a PRNG themselves, usually ensure the quality of the sample space, that is, the set of all possible realizations that can be obtained from the generator. We are left to empirical testing, in which the PRNG is treated as a black box, to gain confidence that the samples will suit the needs of our application. Here, we try to remodel important features of the target application in a test statistic and look for any “non-monkeyness” in the corresponding test results.
The large number of empirical tests presented in the literature in various setups and styles (see Bratley, Fox, and Schrage 1983, Knuth 1997, and L’Ecuyer 1992 for surveys, and Altman 1988, Bernhofen et al. 1996, DeMatteis and Pagnutti 1995, Dudewicz et al. 1995, Eichenauer-Herrmann, Herrmann, and Wegenkittl 1997, Entacher, Uhl, and Wegenkittl 1998, Ferrenberg, Landau, and Wong 1992, Marsaglia 1985, and Vattulainen, Ala-Nissila, and Kankaala 1994, 1995 for examples) makes it difficult to assess the differences and redundancies within these batteries. We present a construction kit which unifies many well-known approaches in a general setup in Section 2 and analyze two important families of tests in Sections 3 and 4. We will see that these families are strongly connected by the notion of entropy. We consider a scale ranging from highly specific to rather universal tests in Section 5 and examine the necessity of both types of tests.

2 CONSTRUCTION OF TESTS

The construction kit in Figure 1 exhibits the major building blocks of empirical testing procedures. From top downwards we have two input modules, the PRNG and the monkey, three feature extraction modules, and a comparison module. The latter, C, is used to measure the extent of “non-monkeyness” of the PRNG with respect to the selected features. The test rejects the generator if the observed behavior would be very unlikely to occur were the generator replaced by the monkey. As to the middle modules, we have

• a keyboard device K, such as a bit-cutting mechanism, even-odd testing, or coin-throw simulation, for turning the pseudorandom numbers (PRNs) into letters from a finite alphabet A, #A = α. Under the monkey hypothesis, the sequence of letters is assumed to be an i.i.d. uniform random sequence on A^∞ with each letter a ∈ A having a fixed probability π_a.
• a finite state automaton A with state space S = {1,...,m} which makes transitions according to the input letters,
• and an observation unit O for reporting statistics on the state automaton such as the number of visits in each state, see Section 3, or the return time to a certain state, see Section 4.
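The modules listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the setup of Figure 1 itself: the function names, the two-letter alphabet, the four-state automaton that remembers the last two letters, the "largest deviation" summary, and the simulation-based comparison are all hypothetical choices made for the example. States are numbered from 0 rather than from 1, and Python's random.Random stands in for the monkey even though it is itself a PRNG.

```python
import random
from collections import Counter


def keyboard(u, alpha):
    """K: map a number u in [0, 1) to a letter in {0, ..., alpha-1}."""
    return min(int(u * alpha), alpha - 1)


def run_automaton(letters, delta, start=0):
    """Automaton: follow delta[state][letter] and yield each visited state."""
    state = start
    for letter in letters:
        state = delta[state][letter]
        yield state


def visit_counts(states, m):
    """O: report the number of visits to each of the m states."""
    counts = Counter(states)
    return [counts[s] for s in range(m)]


def feature(source, n, delta, alpha):
    """Run K, the automaton, and O on n numbers drawn from source.random()."""
    letters = (keyboard(source.random(), alpha) for _ in range(n))
    counts = visit_counts(run_automaton(letters, delta), len(delta))
    # Hypothetical summary: largest deviation of a visit count from n/m.
    return max(abs(c - n / len(delta)) for c in counts)


def comparison(observed, monkey_values):
    """C: empirical p-value, the share of monkey runs at least as extreme."""
    exceed = sum(1 for v in monkey_values if v >= observed)
    return (exceed + 1) / (len(monkey_values) + 1)


class AlternatingGenerator:
    """Deliberately defective toy generator: 0.25, 0.75, 0.25, ..."""

    def __init__(self):
        self.low = False

    def random(self):
        self.low = not self.low
        return 0.25 if self.low else 0.75


# Binary alphabet; the state (0..3 here) encodes the last two letters seen.
ALPHA = 2
DELTA = [[((s << 1) | a) & 3 for a in range(ALPHA)] for s in range(4)]

monkey = random.Random(1999)  # assumption: stand-in for the monkey
monkey_values = [feature(monkey, 1000, DELTA, ALPHA) for _ in range(200)]
observed = feature(AlternatingGenerator(), 1000, DELTA, ALPHA)
p_value = comparison(observed, monkey_values)
print(observed, p_value)
```

The toy generator alternates between two letters, so the automaton almost never visits two of its four states and the comparison step yields a very small p-value. Estimating the monkey's behavior by simulation keeps the sketch independent of any distributional result for the chosen summary.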