REVIEW

The trouble with numbers: Some fundamental flaws with using standardized outcome measures

Brian Rodgers

Counselling Programme, Auckland University of Technology, Auckland, New Zealand

Correspondence
Brian Rodgers, AUT University, South Campus, Private Bag 92006, Auckland 1142, New Zealand.
Email: brian.rodgers@aut.ac.nz

DOI: 10.1002/ppi.1423
Psychother Politics Int. 2017;e1423. https://doi.org/10.1002/ppi.1423
Copyright © 2017 John Wiley & Sons, Ltd. wileyonlinelibrary.com/journal/ppi

Abstract
The modern paradigm of evidence‐based practice dominates the therapeutic world and influences all aspects of the profession. Yet this pervasive concept is based on surprisingly shaky ground. When looked at in detail, the source of the raw data used as the basis of much of this evidence, standardized outcome measures, can be seen to be fundamentally flawed. This article sets out the many methodological, sociopolitical, and technical flaws in standardized outcome measures, and asks what this means for the field of psychotherapy.

KEYWORDS
critique, methodology, outcome measures, psychotherapy, sociopolitical

1 | INTRODUCTION

Standardized outcome measures are at the heart of the evidence‐based practice paradigm. In this paradigm, such measures provide a standardized “ruler” which allows the effectiveness of interventions to be compared objectively, not just within a study but across multiple studies. It is only via this standardized comparison across multiple controlled studies that empirically supported treatments are identified (Chambless & Hollon, 1998). Without this standardized ruler, no such comparisons can be made.

Within the fields of psychotherapy, counselling, psychology, and associated disciplines, such measures typically consist of a list of items in the form of questions, statements, or observations relating to a person's symptoms, behaviour, functioning, well‐being, quality of life, etc. Each response to an item is assigned a numerical value, either a simple binary value (e.g., 1 = True, 0 = False) or a value on some sort of intensity scale (e.g., 0 = Never, through to 5 = Always). These values are then totalled according to a standardized schema to produce scores on one or more scales or dimensions (e.g., overall psychological distress, level of depression, functioning, etc.). Typically, a questionnaire is given to the client before therapy commences, then again some time later (usually at the end of therapy), and the change in scores is calculated to give a representation of the success or otherwise of the therapy.

When combined across multiple studies, this elegant and straightforward system provides compelling evidence for the efficacy of a therapeutic approach or particular intervention. But what if the data that these claims are based on is fundamentally flawed? What if this “standardized ruler” is not as straight and linear as claimed? This article argues that the claims of standardization and objectivity of measurement within research on psychological therapies are grossly overstated, and that, when one looks in detail at the process of outcome measurement, significant flaws are