Computationally efficient algorithms for the two-dimensional Kolmogorov-Smirnov test

Raul H C Lopes, Peter R Hobson and Ivan D Reid

School of Engineering and Design, Brunel University, Uxbridge UB8 3PH, United Kingdom

E-mail: Raul.Lopes@brunel.ac.uk

Abstract. Goodness-of-fit statistics measure the compatibility of random samples against some theoretical or reference probability distribution function. The classical one-dimensional Kolmogorov-Smirnov test is a non-parametric statistic for comparing two empirical distributions which defines the largest absolute difference between the two cumulative distribution functions as a measure of disagreement. Adapting this test to more than one dimension is a challenge because there are $2^d - 1$ independent ways of ordering a cumulative distribution function in $d$ dimensions. We discuss Peacock's version of the Kolmogorov-Smirnov test for two-dimensional data sets which computes the differences between cumulative distribution functions in $4n^2$ quadrants. We also examine Fasano and Franceschini's variation of Peacock's test, Cooke's algorithm for Peacock's test, and ROOT's version of the two-dimensional Kolmogorov-Smirnov test. We establish a lower-bound limit on the work for computing Peacock's test of $\Omega(n^2 \lg n)$, introducing optimal algorithms for both this and Fasano and Franceschini's test, and show that Cooke's algorithm is not a faithful implementation of Peacock's test. We also discuss and evaluate parallel algorithms for Peacock's test.

1. Introduction

Goodness-of-fit statistics measure the compatibility of random samples against some theoretical probability distribution function.
In general, given two independent stochastic variables $X$ and $Y$ whose cumulative distribution functions (CDFs) $F$ and $G$ are unknown, the classical two-sample problem consists of testing the null hypothesis

$H_0 : F(x) = G(x)$, for every $x \in \mathbb{R}^d$

against the general alternative

$H_1 : F(x) \neq G(x)$, for some $x \in \mathbb{R}^d$.

This kind of problem could arise in a context where, given an observed sample $X_1, \ldots, X_n$ and a reference sample $Y_1, \ldots, Y_m$, one must determine whether they come from the same distribution function. The nature of the sets is, however, important in defining the kind of test available. In particular, an important consideration is whether the data are available as discrete points or have been binned into histograms. A well-accepted test for binned distributions is based on the $\chi^2$ statistic [1]. Continuous data can always be binned by grouping the events into ranges, but this usually comes at the price of losing information.

International Conference on Computing in High Energy and Nuclear Physics (CHEP'07), IOP Publishing. Journal of Physics: Conference Series 120 (2008) 042019, doi:10.1088/1742-6596/120/4/042019. © 2008 IOP Publishing Ltd
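For the one-dimensional case ($d = 1$) on unbinned data, the classical two-sample Kolmogorov-Smirnov statistic is the largest absolute difference between the two empirical CDFs. The following Python sketch illustrates that definition only; it is not taken from the paper, and the function name `ks_statistic` is a hypothetical one chosen for this example. It assumes both samples are non-empty sequences of real numbers.

```python
from bisect import bisect_right


def ks_statistic(xs, ys):
    """Largest absolute difference between the empirical CDFs of xs and ys.

    Illustrative sketch of the one-dimensional two-sample statistic;
    `ks_statistic` is a hypothetical name, not an API from the paper.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    sx, sy = sorted(xs), sorted(ys)
    n, m = len(sx), len(sy)
    d = 0.0
    # Both empirical CDFs are right-continuous step functions that change
    # only at sample points, so the supremum is attained at one of them.
    for t in sx + sy:
        fx = bisect_right(sx, t) / n  # fraction of xs that are <= t
        fy = bisect_right(sy, t) / m  # fraction of ys that are <= t
        d = max(d, abs(fx - fy))
    return d
```

Each evaluation uses a binary search, so the sketch costs O((n + m) lg(n + m)) including the sorts. In one dimension there is a single natural ordering of the data, which is what makes this direct computation possible.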