Comparing Methods to Denote Treatment Outcome
in Clinical Research and Benchmarking Mental
Health Care
Edwin de Beurs,1* Marko Barendregt,1 Arco de Heer,2 Erik van Duijn,3 Bob Goeree,4 Margot Kloos,5 Kees Kooiman,6 Helen Lionarons7 and Andre Merks8
1 SBG, Bilthoven, the Netherlands
2 Clinical Psychology, Leiden University, Leiden, the Netherlands
3 GGZ-Delfland, Delft, the Netherlands
4 Synaeda, Leeuwarden, the Netherlands
5 Propersona, Renkum, the Netherlands
6 Riagg Rijnmond, Vlaardingen, the Netherlands
7 Lionarons-GGZ, Heerlen, the Netherlands
8 Emergis, Goes, the Netherlands
Approaches based on continuous indicators (the size of the pre-to-post-test change; effect size or ΔT)
and on categorical indicators (Percentage Improvement and the Jacobson–Truax approach to Clinical
Significance) are evaluated to determine which has the best methodological and statistical characteris-
tics, and optimal performance, in comparing outcomes of treatment providers. Performance is compared
in two datasets from providers using the Brief Symptom Inventory or the Outcome Questionnaire.
Concordance of methods and their suitability to rank providers is assessed. Outcome indicators tend
to converge and lead to a similar ranking of institutes within each dataset. Statistically and conceptu-
ally, continuous outcome indicators are superior to categorical outcomes as change scores have more sta-
tistical power and allow for a ranking of providers at first glance. However, the Jacobson–Truax
approach can complement the change score approach as it presents outcome information in a clinically
meaningful manner. Copyright © 2015 John Wiley & Sons, Ltd.
Key Practitioner Messages:
• When comparing various indicators of treatment outcome, statistical considerations designate continuous
outcomes, such as the size of the pre–post change (effect size or ΔT), as the optimal choice.
• Expressing outcome in proportions of recovered, changed, unchanged or deteriorated patients has
supplementary value, as it is more easily interpreted and appreciated by clinicians, managerial staff and,
last but not least, by patients.
• If categorical outcomes are used with small datasets, true differences in institutional performance may
be obscured by the reduced statistical power to detect them.
• With sufficient data, outcomes according to continuous and categorical indicators converge and lead to
similar rankings of institutes’ performance.
Keywords: Treatment outcome, Effect Size, Percentage Improvement (PI), Reliable Change Index (RCI),
Benchmarking
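To make the distinction between continuous and categorical indicators concrete, the following is a minimal sketch of how the indicators named above are commonly computed. It is an illustration under stated assumptions, not the procedure used in this study: the formulas are the conventional textbook ones (pre–post effect size standardized on the pre-test SD, the Jacobson–Truax Reliable Change Index with a 1.96 criterion), lower scores are assumed to mean fewer symptoms, and all function names, the reliability value, the cutoff and the example scores are hypothetical.

```python
import math
from statistics import mean, stdev


def effect_size(pre_scores, post_scores):
    """Continuous indicator: mean pre-to-post change in pre-test SD units."""
    return (mean(pre_scores) - mean(post_scores)) / stdev(pre_scores)


def percentage_improvement(pre, post):
    """Change of one patient as a percentage of the pre-test score."""
    return 100.0 * (pre - post) / pre


def reliable_change_index(pre, post, sd_pre, reliability):
    """Jacobson-Truax RCI: change divided by the standard error of a difference."""
    se_measurement = sd_pre * math.sqrt(1.0 - reliability)
    se_difference = math.sqrt(2.0) * se_measurement
    return (post - pre) / se_difference


def classify(pre, post, sd_pre, reliability, cutoff):
    """Categorical indicator: recovered, changed, unchanged or deteriorated.

    Assumes lower scores are better; 'changed' here means reliably improved
    without moving from above the cutoff to below it.
    """
    rci = reliable_change_index(pre, post, sd_pre, reliability)
    if rci <= -1.96:
        crossed_cutoff = pre >= cutoff and post < cutoff
        return "recovered" if crossed_cutoff else "changed"
    if rci >= 1.96:
        return "deteriorated"
    return "unchanged"


# Hypothetical scores for four patients (not data from the study).
pre_scores = [60, 55, 70, 65]
post_scores = [45, 56, 50, 75]

es = effect_size(pre_scores, post_scores)
categories = [
    classify(pre, post, sd_pre=10.0, reliability=0.9, cutoff=50.0)
    for pre, post in zip(pre_scores, post_scores)
]
print(round(es, 2), categories)
```

The sketch also shows why the categorical approach discards information: two patients with quite different change scores can fall in the same category, which is one way to see the loss of statistical power noted in the key messages.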
Since 2010, the mental healthcare field in the Netherlands
has embarked on a nationwide effort to collect outcome
data to support patient care and enable benchmarking of
treatment providers. The systematic collection of patient-
based data on the outcome of individual treatment is
called Routine Outcome Monitoring (ROM). Ideally,
ROM comprises a baseline assessment with a standardized
diagnostic interview, administration of rating scales and
self-report measures in combination with repeated assess-
ments of patients’ mental health and functioning (de Beurs
et al., 2011). The primary reason for collecting outcome
data routinely is that it can support individual therapy,
as these data provide feedback to the professional and
the patient about progress, or lack thereof (Lambert,
2007). When certain conditions are met, aggregated ROM
data may provide transparency regarding the effectiveness
of mental health care in everyday clinical practice and
allow for comparison of mental healthcare institutes.
*Correspondence to: Edwin de Beurs, SBG, Postbus 281, Bilthoven
3720 AG, the Netherlands.
E-mail: edwin.debeurs@sbggz.nl
Clinical Psychology and Psychotherapy
Clin. Psychol. Psychother. (2015)
Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/cpp.1954