Statistical power of an information-based test and its application to wave height data

Ahmet Sezer, Senay Asma *

Department of Statistics, Faculty of Science, Anadolu University, 26470 Eskisehir, Turkey

Article info

Article history: Received 16 September 2009; received in revised form 3 March 2010; accepted 8 March 2010

Keywords: Hypothesis testing; Information measure; Power curves; Likelihood ratio test; Wave height data

Abstract

Modeling wave heights is crucial for many maritime applications. An appropriate statistical distribution for describing wave heights is the Rayleigh distribution. Estimating and testing the significance of the parameters of a distribution are important in statistical modeling and allow meaningful predictions about uncertain events. We propose an information-based method to test the significance of parameters of a given one-dimensional distribution. The power of the proposed test is compared to that of the likelihood ratio test for hypotheses on the parameters of the exponential and Rayleigh distributions. Monte Carlo simulations demonstrate that the proposed method yields a satisfactory power level that is comparable to that of the likelihood ratio test. The method is illustrated using real wave height data.

© 2010 Elsevier Ltd. All rights reserved.

1. Introduction

Wave height data are essential for wave analysis and ocean wave forecasting, and are valuable in global wave climatology. The importance of good estimates of wave patterns for the construction of maritime structures has been highlighted by many researchers (Muraleedharan et al., 2007). On a short-term scale (a few hours), Longuet-Higgins (1952) showed that the Rayleigh distribution is the most appropriate to describe the distribution of wave heights. Many other studies have assumed that surface wave heights generally follow a Rayleigh distribution (e.g., Wiberg and Sherwood, 2008).
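As an illustrative aside that is not part of the original analysis, the Rayleigh model for short-term wave heights can be sketched in Python. The scale parameterization with sigma, the function names, the seed and the synthetic sample are all assumptions made here for illustration; the parameterization used later in the paper may differ.

```python
import math
import random


def rayleigh_pdf(x, sigma):
    """Rayleigh density (x / sigma^2) * exp(-x^2 / (2 sigma^2)) for x >= 0, else 0."""
    if x < 0:
        return 0.0
    return (x / sigma**2) * math.exp(-x**2 / (2.0 * sigma**2))


def rayleigh_sample(n, sigma, rng):
    """Draw n variates by inverting the CDF F(x) = 1 - exp(-x^2 / (2 sigma^2))."""
    return [sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random())) for _ in range(n)]


def rayleigh_mle(heights):
    """Maximum likelihood estimate of sigma: sqrt(sum(x^2) / (2 n))."""
    return math.sqrt(sum(h * h for h in heights) / (2.0 * len(heights)))


rng = random.Random(0)  # fixed seed (assumed) so the sketch is reproducible
heights = rayleigh_sample(5000, sigma=1.5, rng=rng)  # synthetic values, not real wave data
sigma_hat = rayleigh_mle(heights)
print(f"estimated sigma = {sigma_hat:.3f}")
```

With a few thousand synthetic observations, the estimate should lie close to the value of sigma used to generate the sample, which is the kind of parameter estimate that the tests in Section 2 are designed to assess.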
On the other hand, the distribution of wave heights on a long-term scale (years) has been investigated by numerous authors. Battjes (1972) pointed out that symmetric distributions, such as the normal distribution, are not suitable for describing the long-term distribution of wave heights. Indeed, skewed distributions, such as the Gumbel and Weibull distributions, fit much better (Rossouw, 1988; Goda and Kobune, 1990; Teng et al., 1993; Van Vledder et al., 1993; Burcharth and Liu, 1994; Teng and Palao, 1996; Ferreira and Guedes Soares, 1999).

Once a distribution has been selected, its parameters need to be estimated. Testing the significance of the parameters is also important in statistical modeling because it allows meaningful predictions about uncertain events. Here we propose an information-based test for hypotheses regarding the parameters of a given one-dimensional distribution. We compare the proposed test to the likelihood ratio test to assess its performance. Power curves are obtained for different parameter values, sample sizes and alternative hypotheses (one- and two-sided alternatives) for the exponential and Rayleigh distributions. Because of its popularity, the Rayleigh distribution was chosen to illustrate the method using wave height data in marine climatology.

The remainder of the paper is organized as follows. The likelihood ratio and information-based tests are described in Section 2. Monte Carlo simulation results are presented in Section 3. The methods are applied to real wave height data in Section 4. A discussion and conclusions are presented in Section 5.

2. Hypothesis testing

In this section, we describe the likelihood ratio test (LRT) and an information-based test (IT) for testing hypotheses on the parameters of a given one-dimensional distribution.

2.1. Likelihood ratio test

Let $x_1, x_2, \ldots, x_n$ denote $n$ independent random variables with probability density functions $f(x_i; \theta)$, where $i = 1, \ldots, n$ and $\theta = (\theta_1, \ldots, \theta_n)$ is the parameter vector. The set of all parameter points is denoted by $\Omega$, and $\Omega_0$ is a subset of the parameter space $\Omega \subset \mathbb{R}$. We might wish to test the null hypothesis $H_0: \theta \in \Omega_0$ against alternative hypotheses.

Computers & Geosciences 36 (2010) 1316–1324. doi:10.1016/j.cageo.2010.03.015. Journal homepage: www.elsevier.com/locate/cageo
* Corresponding author. Tel.: +90 222 3350580x4670x4688; fax: +90 222 320 4910. E-mail addresses: a.sezer@anadolu.edu.tr (A. Sezer), senayyolacan@anadolu.edu.tr (S. Asma).

We define the likelihood