A Poor Prognosis for the Diagnostic Screening Critique of Statistical Tests

D Mayo¹ and R D Morey²

¹ Department of Philosophy, Virginia Tech; email: mayod@vt.edu
² School of Psychology, Cardiff University; email: moreyr@cardiff.ac.uk

It is well known that in interpreting the results of a statistical test, one should not confuse Pr(test rejects H0|H0 is false) with Pr(H0 is false|test rejects H0). A popular criticism of significance tests assumes that the latter probability, often called the positive predictive value (PPV), is the appropriate measure of the warrant for “H0 is false,” understood as “there’s a real effect” or “there’s a genuine discrepancy from H0”. We argue that a high PPV for a hypothesis doesn’t align with its being well-tested, plausible, or warranted. This can be seen from the perspective of P-values, likelihood ratios or confidence intervals. The assumption that it is valid to marshal elements from significance tests in order to compute a PPV threatens to lead to further misunderstanding and misuse of this statistical methodology.

1. Introduction and Overview

As cases of high profile failures of replication in science mount, many fields find themselves in a state of introspection on statistical methodology. No method has received more attention than statistical significance testing, e.g., the American Statistical Association’s (ASA) Statement on P-values (Wasserstein and Lazar 2017). In the felt urgency to proffer reforms to restore scientific credibility, there has been inadequate examination of equivocal terms being bandied about. Most practitioners are content to use an eclectic set of methods, which is all the more reason not to expect the same values from methods that measure different things. Yet one of the more popular accounts intended as a reform or a replacement for significance tests conflates concepts from Bayesian and frequentist statistics.
We take no position on Bayesian-frequentist debates; all positions are hurt by the present lack of clarity on fundamental issues. We aim to restore clarity to a corner of the debate over significance testing.

Overview

There is some irony in the fact that failures of replication are often discovered by means of the very method often blamed: statistical significance tests.