Received: 31 May 2018 | Revised: 10 April 2019 | Accepted: 11 April 2019
DOI: 10.1002/pst.1950

MAIN PAPER

Controlling type I error in the reference-scaled bioequivalence evaluation of highly variable drugs

Jordi Ocaña (1), Joel Muñoz (2)

(1) Department of Genetics, Microbiology and Statistics, Universitat de Barcelona, Barcelona, Spain
(2) Faculty of Physical and Mathematical Sciences, Department of Statistics, University of Concepcion, Concepcion, Chile

Correspondence: Jordi Ocaña, Faculty of Biology, Department of Genetics, Microbiology and Statistics, Av. Diagonal 643, 08028 Barcelona, Spain. Email: jocana@ub.edu

Funding information: Ministerio de Economía y Competitividad (Spain), Grant/Award Number: Grant MTM2015-64465-C2-1-R (MINECO/FEDER); Generalitat de Catalunya, Grant/Award Number: 2017 SGR 622

Reference-scaled average bioequivalence (RSABE) approaches for highly variable drugs are based on linearly scaling the bioequivalence limits according to the reference formulation within-subject variability. RSABE methods have type I error control problems around the value where the limits change from constant to scaled. In all these methods, the probability of type I error has only one absolute maximum at this switching variability value. This allows adjusting the significance level to obtain statistically correct procedures (that is, those in which the probability of type I error remains below the nominal significance level), at the expense of some potential power loss. In this paper, we explore adjustments to the EMA and FDA regulatory RSABE approaches, and to a possible improvement of the original EMA method, designated as HoweEMA. The resulting adjusted methods are completely correct with respect to type I error probability. The power loss is generally small and tends to become irrelevant for moderately large (affordable in real studies) sample sizes.
KEYWORDS
confidence interval inclusion principle, point estimate constraint, scaled average bioequivalence

1 INTRODUCTION

Generic drug products contain the same active substance as brand drugs, but under a different formulation. Provided that the test T (generic) formulation and the reference R (brand, innovator) formulation contain an equal quantity of the active substance (whose safety and therapeutic value were already demonstrated through a long and expensive clinical trial), the bioequivalence between them is defined in terms of the relative rate and extent of absorption of the active substance, as judged by comparing plasma concentration curves after a single administration of T or R. For each subject participating in the study, and for each administration, these concepts are characterized by means of variables like the area under the curve (AUC) or the maximum concentration reached (Cmax), both of which are computed from the resulting plasma concentration vs time curve.

According to the criteria of regulatory agencies like the US Food and Drug Administration (FDA) and the European Medicines Agency (EMA), bioequivalence holds when the ratio of the geometric means of the bioavailabilities of T and R falls within the interval 0.80 to 1.25 (= 1/0.80). This criterion is usually expressed in logarithmic scale. Then, for variables like log(AUC) or log(Cmax), the difference in means between R and T (the formulation effect, $\phi$) must be within the limits $\pm 0.223$, where $0.223 \approx \log(1.25) = -\log(0.80)$. In inferential statistical terms, demonstrating bioequivalence amounts to rejecting a null hypothesis of bioinequivalence in favour of an alternative of bioequivalence:

$$H_0: \phi \le -0.223 \;\lor\; \phi \ge 0.223 \qquad H_1: -0.223 < \phi < 0.223. \tag{1}$$

Pharmaceutical Statistics. 2019;1–17. wileyonlinelibrary.com/journal/pst © 2019 John Wiley & Sons, Ltd.
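The arithmetic behind the limits in (1) can be checked with a minimal Python sketch. The function names below (`log_limits`, `in_alternative`, `ci_inclusion`) are hypothetical and introduced only for illustration. The last helper assumes the confidence interval inclusion principle named in the keywords: bioequivalence is declared when a confidence interval for the formulation effect lies entirely inside the limits. How that interval is computed is outside the scope of this sketch.

```python
import math

# Regulatory limits on the ratio of geometric means, as stated in the text.
RATIO_LOWER, RATIO_UPPER = 0.80, 1.25


def log_limits(lower=RATIO_LOWER, upper=RATIO_UPPER):
    """Return the bioequivalence limits on the natural-log scale."""
    return math.log(lower), math.log(upper)


def in_alternative(effect, limit=math.log(RATIO_UPPER)):
    """True when the formulation effect satisfies H1 in (1): -limit < effect < limit."""
    return -limit < effect < limit


def ci_inclusion(ci_lower, ci_upper, limit=math.log(RATIO_UPPER)):
    """Hypothetical helper (assumption, not the authors' code): declare
    bioequivalence when a confidence interval for the formulation effect,
    given on the log scale, lies strictly inside (-limit, limit)."""
    return -limit < ci_lower and ci_upper < limit


if __name__ == "__main__":
    lo, hi = log_limits()
    print(f"log-scale limits: ({lo:.3f}, {hi:.3f})")
    print("effect 0.10 in H1:", in_alternative(0.10))
    print("CI (-0.10, 0.15) inside limits:", ci_inclusion(-0.10, 0.15))
```

Because 1.25 = 1/0.80, the two log-scale limits are symmetric about zero, which is why (1) can be written with a single constant, 0.223.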