Computational Statistics and Data Analysis 56 (2012) 2097–2111

Cramér–von Mises and characteristic function tests for the two and k-sample problems with dependent data

Jean-François Quessy, François Éthier

Département de mathématiques et d’informatique, Université du Québec à Trois-Rivières, Trois-Rivières (QC) Canada, G9A 5H7

Article info

Article history:
Received 19 May 2011
Received in revised form 20 December 2011
Accepted 21 December 2011
Available online 3 January 2012

Keywords: Characteristic function; Copula; Dependent data; Empirical processes; Multiplier central limit theorem; Two and k-sample problems

Abstract

Statistical procedures for testing the equality of two and k univariate distributions based on samples of dependent observations are proposed in this work. The test statistics are L2 distances of standard empirical and characteristic function processes. The p-values of the tests are obtained from a version of the multiplier central limit theorem whose asymptotic validity is established. Simple formulas for the test statistics and their multiplier versions in terms of multiplication of matrices are provided. Simulations under many patterns of dependence characterized by copulas show the good behavior of the tests in small samples, both in terms of their power and of their ability to keep their nominal level under the null hypothesis.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

The two-sample and k-sample problems are classical in statistics. In the general setting with $k \geq 2$ samples, the problem involves testing $H_0 : F_1 = \cdots = F_k$ against $H_1 : F_j \neq F_{j'}$ for some $j, j' \in \{1, \ldots, k\}$, where $F_1, \ldots, F_k$ are distribution functions (either univariate or multivariate). This topic has been investigated by several authors, especially in the case $k = 2$ and when the distributions are univariate.
So far, this issue has been considered almost exclusively under the assumption of independent samples. In that case, most of the classical testing procedures, including the Wilcoxon–Mann–Whitney, Kolmogorov–Smirnov and Cramér–von Mises type statistics, are marginal-free; this allows for an easy computation of critical values by way of Monte Carlo simulations. For k = 2, recent contributions include that of Freitag et al. (2007) based on the Mallows distance, Bajorunaite and Klein (2007) for the equality of cumulative incidence functions, John and Priebe (2007) based on a weighted generalized Mann–Whitney–Wilcoxon statistic, and Neubert and Brunner (2007) for a studentized permutation test. For the general k-sample problem, the first contributions are those of Kiefer (1959) and Bickel (1968), who generalized the use of the Kolmogorov–Smirnov and the Cramér–von Mises statistics; the idea was later extended to the Anderson–Darling functional by Scholz and Stephens (1987). More recent works are those of Wylupek (2010) and Zhang and Wu (2007) using data-driven and likelihood ratio based tests, respectively, and Martínez-Camblor and de Uña-Álvarez (2009) based on kernel density estimates.

Correspondence to: Département de mathématiques et informatique, Université du Québec à Trois-Rivières, P.B. 500, Trois-Rivières, Canada, G9A 5H7.
E-mail addresses: Jean-Francois.Quessy@uqtr.ca (J.-F. Quessy), Francois.Ethier@uqtr.ca (F. Éthier).
0167-9473/$ – see front matter. doi:10.1016/j.csda.2011.12.021

However, the validity of most of the existing procedures no longer holds when the samples are dependent. The reason is that although they are still free of the unknown (common) distribution function under H_0, their behavior depends on the unknown dependence structure. This causes an obvious problem for the computation of valid p-values under any kind of