Empirical Power and Sample Size Calculations for Cluster-Randomized and Cluster-Randomized Crossover Studies

Nicholas G. Reich 1*, Jessica A. Myers 2, Daniel Obeng 3, Aaron M. Milstone 4, Trish M. Perl 5

1 Division of Biostatistics and Epidemiology, University of Massachusetts, Amherst, Massachusetts, United States of America, 2 Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women's Hospital, Boston, Massachusetts, United States of America, 3 Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, United States of America, 4 Department of Pediatrics, Division of Pediatric Infectious Diseases, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America, 5 Department of Medicine, Division of Infectious Diseases, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America

Abstract

In recent years, the number of studies using a cluster-randomized design has grown dramatically. In addition, the cluster-randomized crossover design has been touted as a methodological advance that can increase the efficiency of cluster-randomized studies in certain situations. While the cluster-randomized crossover trial has become a popular tool, standards of design, analysis, reporting and implementation have not been established for this emergent design. We address one particular aspect of cluster-randomized and cluster-randomized crossover trial design: estimating statistical power. We present a general framework for estimating power via simulation in cluster-randomized studies with or without one or more crossover periods. We have implemented this framework in the clusterPower software package for R, freely available online from the Comprehensive R Archive Network. Our simulation framework is easy to implement, and users may customize the methods used for data analysis. We give four examples of using the software in practice.
The clusterPower package could play an important role in the design of future cluster-randomized and cluster-randomized crossover studies. This work is the first to establish a universal method for calculating power for both cluster-randomized and cluster-randomized crossover trials. More research is needed to develop standardized and recommended methodology for cluster-randomized crossover studies.

Citation: Reich NG, Myers JA, Obeng D, Milstone AM, Perl TM (2012) Empirical Power and Sample Size Calculations for Cluster-Randomized and Cluster-Randomized Crossover Studies. PLoS ONE 7(4): e35564. doi:10.1371/journal.pone.0035564

Editor: Sten H. Vermund, Vanderbilt University, United States of America

Received October 17, 2011; Accepted March 19, 2012; Published April 27, 2012

Copyright: © 2012 Reich et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: DO, AAM, TMP were funded in part by grants from Sage Products, Inc. NGR and TMP were funded by the ResPECT study (clinicaltrials.gov ID: NCT01249625) through an interagency agreement between the Centers for Disease Control and the United States Department of Veterans Affairs (CDC IAA# 09FED905876). The findings and conclusions in this manuscript are those of the authors and do not necessarily represent the views of the Centers for Disease Control and Prevention. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: This study was partly funded by Sage Products, Inc. There are no patents, products in development or marketed products to declare. This does not alter the authors' adherence to all the PLoS ONE policies on sharing data and materials.
* E-mail: nick@schoolph.umass.edu

Introduction

Clinical trials are often designed to assess the effectiveness of a particular intervention. While evidence from individually-randomized, masked clinical trials is considered the gold standard of scientific evidence, in many settings such a design is not feasible and, sometimes, is unethical. Cluster-randomized trials randomize groups of people instead of individuals. These studies can be valuable tools for evaluating interventions that are best implemented at the group level. Some have argued that a cluster-randomized design yields more accurate estimates of the treatment effect of interest because the treatment effect is estimated on the level at which the intervention is applied [1]. Looking forward, cluster-randomized designs will continue to play an important role in clinical effectiveness research, filling in when individually randomized studies are not possible.

Many questions remain about best practices for cluster-randomized studies. A variant of the cluster-randomized study design, the cluster-randomized crossover design, has been touted as a methodological advance that can increase the efficiency of cluster-randomized studies in certain situations. While the cluster-randomized crossover trial has become a popular tool, standards of design, analysis, reporting and implementation have not been established for this emergent design. This is largely because the principles from cluster-randomized trials with no crossover are not easily applied to a crossover setting. The crossover introduces a significant paradigm change in analyzing cluster-randomized data: in a cluster-randomized crossover trial, statistical inference is based on evidence drawn from within-cluster comparisons, whereas in standard cluster-randomized trials, between-cluster comparisons provide the evidence.
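The distinction between within-cluster and between-cluster evidence can be made concrete with a small numerical sketch. The snippet below is a toy illustration in Python (not the clusterPower package, which is written in R); the number of clusters, variance components, and treatment effect are invented for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Hypothetical data: 10 clusters, each observed as a cluster-period mean
# under control and under treatment (a crossover design).
n_clusters = 10
cluster_re = rng.normal(0.0, 0.5, n_clusters)   # effect shared within a cluster
control = cluster_re + rng.normal(0.0, 0.3, n_clusters)
treated = cluster_re + 0.4 + rng.normal(0.0, 0.3, n_clusters)

# Crossover analysis: within-cluster differences cancel the shared cluster
# effect, so inference rests only on the within-cluster variability.
diffs = treated - control
within = stats.ttest_rel(treated, control)

# Parallel-design analogue: a between-cluster comparison retains the
# between-cluster variance in its error term.
between = stats.ttest_ind(treated, control)
```

Because the cluster effects cancel in `diffs`, the paired within-cluster comparison typically has a smaller error variance than the between-cluster comparison, which is the efficiency gain the crossover design can offer.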
Therefore, techniques for analyzing data from cluster-randomized crossover trials are very different from those used to analyze data from cluster-randomized trials with no crossover.

In this paper, we discuss a single aspect of designing cluster-randomized and cluster-randomized crossover trials: estimating statistical power. Many scientific studies set out to gather evidence that can be used to evaluate a specific hypothesis. An investigator
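The simulate-then-analyze loop at the heart of empirical power estimation can be sketched in a few lines. The following is a minimal, language-neutral illustration in Python; the clusterPower package implements this idea in R with its own interface, and the outcome model here (a normal outcome, equal-sized clusters, and a t-test on cluster-level means) and all parameter values are our assumptions for illustration, not the package's defaults.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2012)

def simulate_trial(n_clusters, n_per_cluster, effect, icc, sigma2=1.0):
    """Simulate one parallel cluster-randomized trial with a normal outcome
    and return the p-value from a t-test on cluster-level means."""
    sigma2_b = icc * sigma2          # between-cluster variance component
    sigma2_w = (1 - icc) * sigma2    # within-cluster variance component
    arm = np.repeat([0, 1], n_clusters // 2)   # half of the clusters per arm
    cluster_re = rng.normal(0.0, np.sqrt(sigma2_b), size=n_clusters)
    means = np.array([
        (effect * arm[j] + cluster_re[j]
         + rng.normal(0.0, np.sqrt(sigma2_w), size=n_per_cluster)).mean()
        for j in range(n_clusters)
    ])
    return stats.ttest_ind(means[arm == 1], means[arm == 0]).pvalue

def empirical_power(nsim=1000, alpha=0.05, **design):
    """Empirical power: the fraction of simulated trials rejecting H0 at alpha."""
    pvals = [simulate_trial(**design) for _ in range(nsim)]
    return float(np.mean(np.asarray(pvals) < alpha))
```

A call such as `empirical_power(nsim=500, n_clusters=20, n_per_cluster=20, effect=0.5, icc=0.05)` returns the estimated power for that design; swapping in a different analysis inside `simulate_trial` is how a simulation framework lets users customize the analysis method.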