Maximizing the conditional overlap in business surveys

Ioana Schiopu-Kratina (a), Jean-Marc Fillion (b), Lenka Mach (c,*), Philip T. Reiss (d,e)

(a) Department of Mathematics and Statistics, University of Ottawa, 585 King Edward, Ottawa, Ontario, Canada K1N 6N5
(b) Business Survey Methods Division, Statistics Canada, Ottawa, Ontario, Canada K1A 0T6
(c) Social Survey Methods Division, Statistics Canada, Ottawa, Ontario, Canada K1A 0T6
(d) Department of Child & Adolescent Psychiatry and Department of Population Health, New York University School of Medicine, 1 Park Avenue, 7th Floor, New York, NY 10016, USA
(e) Nathan S. Kline Institute for Psychiatric Research, 140 Old Orangeburg Road, Orangeburg, NY 10962, USA

Article info
Article history: Received 10 April 2012; received in revised form 3 February 2014; accepted 4 February 2014; available online 19 February 2014
Keywords: Sample coordination; Stratified SRSWOR; Linear programming; Expected sample overlap; Row error variance

Abstract
This article presents novel sequential methods of sample coordination appropriate for a repeated survey, with a stratified design and simple random sampling without replacement (SRSWOR) selection within each stratum, when the composition or definition of strata changes. Such changes could be the result of updating the frame for births, deaths, or the modification of the industry classification system. Given that a sample has already been selected according to a first (before the frame updates) SRSWOR design, our general aim is to select a minimum number of new units for the second (after the updates) survey while preserving the first-order inclusion probabilities of units in the second SRSWOR design. Sequential methods presently in use can attain a large expected overlap, but do not control the overlap on each pair of selected samples.
In this article we present a set of new methods for maximizing the expected overlap, which can handle realistic situations when strata and the associated sample sizes are large. These methods include one that not only maximizes the expected overlap but, for any initially selected sample, maximizes its overlap with the second sample; its superior performance is illustrated with numerical examples.
© 2014 Elsevier B.V. All rights reserved.

Journal of Statistical Planning and Inference 149 (2014) 98–115
http://dx.doi.org/10.1016/j.jspi.2014.02.002
ISSN 0378-3758
* Corresponding author. Tel.: +1 613 951 4754. E-mail address: Lenka.Mach@statcan.gc.ca (L. Mach).

1. Introduction

1.1. Motivation for sample coordination

Statistical agencies often need to control the overlap of samples drawn from overlapping populations. For instance, maximizing the overlap (positive sample coordination) can increase the precision of estimates of change between two occasions of a repeated survey and reduce the costs of first contacts. Minimizing the overlap (negative sample coordination) aims at minimizing the number of surveys for which the same unit is selected, to avoid overburdening of individual respondents.

Maximizing the expected overlap of samples has been a topic of interest for many years. Raj (1956) considered maximizing the expected overlap of sites (villages) to be visited by interviewers to collect information for two surveys. In this case, the overlap is controlled at the level of primary sampling units (PSUs) and the resulting design minimizes the