Deciphering P values: Beware false certainty

In his Policy Forum “Aligning statistical and scientific reasoning” (3 June, p. 1180), S. N. Goodman discussed a recent statement by the American Statistical Association highlighting the persistent misapplication of the P value and its meaning in the scientific community (1). Statistical misinterpretations in science also undermine the development of robust evidence-based policy and management [e.g., (2, 3)]. Findings of nonsignificance, in the absence of context, may lead to a false certainty that no impact occurs (4, 5).

In the field of ecology, false certainty that a human activity (e.g., release of nutrients, heavy metals, or novel compounds) has no effect on species could lead to planning decisions that cause adverse species interactions (6). False certainty that a recently arrived invader is “safe” and unlikely to cause harm could lead to the degradation of ecosystems (3, 7).

Goodman suggests that scientists combine the use of P values with context to establish more robust thresholds. The appropriate use and reporting of a priori and post hoc power analyses will provide relevant context to a P value (2, 3, 5, 6). Editors should require authors to provide such context. Nonsignificant tests should be clearly identified as inconclusive; they cannot shed light on impact or effect where power is unduly low (8). Moreover, nonsignificant findings should be published only when accompanied by associated analyses of power, particularly for small effect sizes or small sample sizes (5). We must communicate clearly to policy-makers that the absence of evidence of impact is not equivalent to the absence of impact.

Chad L. Hewitt,1* Marnie L. Campbell,1 Alisha Dahlstrom Davidson2
1University of Waikato, Hamilton, 3240, New Zealand.
2Wayne State University, Detroit, MI 48202, USA.
*Corresponding author. Email: chewitt@waikato.ac.nz

REFERENCES
1. R. L. Wasserstein, N. A. Lazar, Am. Stat.
10.1080/00031305.2016.1154108 (2016).
2. B. D. Mapstone, Ecol. Appl. 5, 401 (1995).
3. A. D. Davidson, C. L. Hewitt, Biol. Inv. 16, 1165 (2014).
4. T. Page, Ecol. Law Q. 7, 207 (1978).
5. P. G. Fairweather, Mar. Freshw. Res. 42, 555 (1991).
6. J. Weinberg et al., Mar. Biol. 93, 305 (1986).
7. D. Simberloff et al., Nature 475, 36 (2011).
8. J. Hayes, Ecotoxicol. Environ. Safety 14, 73 (1987).

10.1126/science.aag3065

Deciphering P values: Defining significance

In his Policy Forum “Aligning statistical and scientific reasoning” (3 June, p. 1180), S. N. Goodman cautions against using P values to determine statistical significance in the absence of context, but he does not adequately define significance level, P value, and hypothesis.

Significance level is decided before the test; it is the confidence the researcher deems necessary to reject the null hypothesis. The P value emerges from the test and (along with other evidence, as Goodman rightly notes) serves as a tool to make and interpret the statistical decision.

A well-stated hypothesis describes a state of nature. It is either true or not true, not subject to probability. The phrase “probability the hypothesis is true” is meaningless. One can only say, “likelihood that the observed data came from a population characterized by the hypothesis.”

Statistical inference was the 20th century’s greatest contribution to epistemology. But all it means is that if one rejects a hypothesis at the 90% level, and if one were to repeat the test on 100 independent samples, then one would expect the same result approximately 90 times. Thus one or two replications, even conducted by different researchers, would not lead to firm knowledge.

Fred Phillips
Yuan Ze University, Zhongli District, Taoyuan City, 32003, Taiwan and Stony Brook University, New York, NY 11794, USA.
Email: fred.phillips@stonybrook.edu

10.1126/science.aah4157

Expanding protected areas is not enough

The Convention on Biological Diversity (CBD) Aichi Target 11 calls for a substantial expansion in terrestrial and marine protected areas by 2020, but the change may not be sufficient to meet its intended conservation goals. Although protected areas can be effective conservation tools (1), many fail to halt species decline (2). All global and local conservation decisions must be underpinned by comprehensive and strategic evaluation of the tangible benefits of protected areas for species conservation. Such an evaluation system must be targeted, institutionally embedded, and scientifically credible (with controls, counterfactuals, replication, and standard methods). It will

LETTERS. Edited by Jennifer Sills. Science, 5 August 2016, Vol. 353, Issue 6299, p. 551. sciencemag.org. Published by AAAS.
Image caption: One or two replications of a study result cannot definitively support or reject a hypothesis. (Image: SEKULICN/ISTOCKPHOTO.COM)