Proceedings of the 2007 Winter Simulation Conference
S. G. Henderson, B. Biller, M.-H. Hsieh, J. Shortle, J. D. Tew, and R. R. Barton, eds.

ANALYSIS AND GENERATION OF RANDOM VECTORS WITH COPULAS

Johann Christoph Strelen
Rheinische Friedrich–Wilhelms–Universität Bonn
Römerstr. 164, 53117 Bonn, GERMANY

Feras Nassaj
Fraunhofer Institute for Applied Information Technology FIT
Schloss Birlinghoven
53754 Sankt Augustin, GERMANY

ABSTRACT

Copulas are used in finance and insurance for modeling stochastic dependency. They capture the entire dependence structure, not only the correlations. Here they are estimated from measured samples of random vectors. The copula and the marginal distributions of the vector elements define a multivariate distribution of the sample, which can be used to generate random vectors with this distribution. The approach can be applied to time series as well. An implemented algorithm is proposed. It is fast and allows for random vectors of high dimension, for example 100.

1 INTRODUCTION

Stochastic models and discrete simulation are indispensable means for the quantitative analysis of systems. It is well known that failing to model external influences carefully, especially the load, may lead to wrong results and ultimately to wrong decisions based on the simulation results. One reason for poor load models is ignoring dependencies, i.e., using independent random variables instead of properly jointly distributed random vectors or stochastic processes. External influences such as load or failures of system components can be incorporated into the model using observed traces or input models, namely random variables, random vectors, or stochastic processes. Data from traces can be used directly. If the input is modeled, the data are realisations of the model.
The use of random variates is well understood and has long been common; the use of generated random vectors and stochastic processes is much more difficult, less popular, and a topic of current research. The use of copulas is common in finance and insurance.

In this paper, we propose to use copulas for the analysis of observed data and for the generation of dependent random variates and time series. The copula of a multivariate distribution describes its dependence structure completely, not only the correlations of the random variables. It is decoupled from the marginal distributions, which can be modeled as empirical distributions or fitted standard distributions as usual.

The use of copulas simplifies a difficult task, finding a multivariate distribution, by splitting it into two easier tasks. The first step is modeling the marginal distributions; the second is estimating the copula. Once the estimated copula and the marginal distributions are available, it is quite simple to use them to generate random vectors.

We model the marginal distributions as usual and estimate the copula from a frequency distribution. This is not common; usually one of the known families of copulas is fitted to the sample. There are many such families, see e.g. Nelsen (1998), but most of them cover only two dimensions. For simulation, more dimensions might be needed. Moreover, as remarked in Blum, Dias, and Embrechts (2002), fitting a sample to a family of copulas is essentially as difficult as estimating the joint distribution in the first place. Thirdly, different families of copulas account for different kinds of dependence. Hence, the input modeler must choose the family according to the actual nature of the dependence. In contrast, an empirical copula incorporates the form of the dependence automatically. For these reasons, we use a kind of empirical copula instead of fitting families of copulas to the samples.

The new technique contrasts with other proposed input models.
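The two-step idea described above (marginals first, then a copula estimated from a frequency distribution) can be illustrated with a minimal sketch. It is not the algorithm proposed in this paper, only one plausible reading under stated assumptions: the marginals are taken as empirical distributions, the unit hypercube is cut into `k` equal subintervals per dimension, and only occupied cells are stored. The function names and the parameter `k` are hypothetical.

```python
import numpy as np

def pseudo_observations(sample):
    """Map each column into (0, 1) through its empirical marginal (ranks)."""
    n = sample.shape[0]
    ranks = sample.argsort(axis=0).argsort(axis=0) + 1  # ranks 1..n per column
    return ranks / (n + 1.0)

def fit_frequency_copula(sample, k):
    """Estimate a copula as relative frequencies on a grid of the unit cube.

    Assumption: k equal subintervals per dimension. Only occupied cells are
    kept, so memory grows with the sample size, not with k**dimension.
    """
    u = pseudo_observations(sample)
    cells = np.minimum((u * k).astype(int), k - 1)      # cell index per dimension
    occupied, counts = np.unique(cells, axis=0, return_counts=True)
    return occupied, counts / counts.sum()

def generate_vectors(sample, occupied, probs, k, size, rng):
    """Draw random vectors from the estimated copula and empirical marginals."""
    # 1. Pick a cell according to its relative frequency.
    idx = rng.choice(len(occupied), size=size, p=probs)
    # 2. Draw a uniform point inside the chosen cell (assumption: the copula
    #    density is treated as constant within a cell).
    u = (occupied[idx] + rng.random((size, occupied.shape[1]))) / k
    # 3. Transform each coordinate with the inverse empirical marginal.
    out = np.empty(u.shape)
    for j in range(u.shape[1]):
        out[:, j] = np.quantile(sample[:, j], u[:, j])
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.9], [0.9, 1.0]], size=5000)
    occupied, probs = fit_frequency_copula(data, k=10)
    generated = generate_vectors(data, occupied, probs, 10, 4000, rng)
    print(np.corrcoef(data.T)[0, 1], np.corrcoef(generated.T)[0, 1])
```

A coarser grid smooths more and loses some dependence inside each cell, while a finer grid follows the sample more closely, so `k` trades accuracy against the amount of data available per cell.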
Autoregressive (AR) processes model time series with Gaussian random variables. They are conveniently fitted to measured data with the linear Yule-Walker equations. ARTA-like models (ARTA (Cario and Nelson 1996) for univariate time series, NORTA (Cario and Nelson 1997) for random vectors, and VARTA (Biller and Nelson 2003) for processes of random vectors) allow for general distributions by using a Gaussian AR process or a multivariate Gaussian random vector as a basis whose random variables are transformed into the desired distributions. The correlations of the basis process differ from the desired correlations.

1-4244-1306-0/07/$25.00 ©2007 IEEE
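The remark that the basis correlations differ from the desired ones can be made concrete with a small numerical sketch of a NORTA-style transformation. It is an illustration under assumptions, not the procedure of the cited papers: the target marginals are chosen here as exponential with rate 1, the dimension is two, and the function name `norta_exponential_pair` is hypothetical.

```python
import math
import numpy as np

def norta_exponential_pair(rho_base, size, rng):
    """Transform a correlated Gaussian pair into exponential marginals.

    Assumption: both target marginals are Exp(1); rho_base is the
    correlation of the Gaussian basis vector, not of the result.
    """
    cov = np.array([[1.0, rho_base], [rho_base, 1.0]])
    z = rng.multivariate_normal(np.zeros(2), cov, size=size)
    # Standard normal CDF, applied element-wise, gives uniform variates.
    u = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    # Guard against u == 1.0, which would give an infinite exponential value.
    u = np.minimum(u, np.nextafter(1.0, 0.0))
    # Inverse CDF of Exp(1): x = -ln(1 - u).
    return -np.log1p(-u)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = norta_exponential_pair(0.8, 200_000, rng)
    # The sample correlation of x comes out below the basis correlation 0.8.
    print(np.corrcoef(x.T)[0, 1])
```

Because the nonlinear transformation lowers the correlation, such models have to search for a basis correlation that yields the desired one after the transformation.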