Revisiting the Reliability of Diagnostic Decisions in Sex Offender Civil Commitment

Richard L. Packard, Jill S. Levenson
Lynn University, Boca Raton, Florida, USA

[Sexual Offender Treatment, Volume 1 (2006), Issue 3]

Abstract

Levenson (2004) investigated the inter-rater reliability of DSM-IV diagnoses commonly assessed by forensic evaluators in sex offender civil commitment proceedings and determined that the reliability of civil commitment selection (kappa = .54) and of DSM-IV diagnostic categories (kappa = .23 to .70) was poor. The current study first reviews the limitations of using kappa in reliability studies and the reasons why the statistic may lead to paradoxical findings. Next, using Levenson’s data as a demonstration, alternative statistical analyses measuring raw proportions of agreement, odds and risk ratios, and estimated conditional probabilities were used to examine reliability. Agreement on the existence of the majority of the diagnosed disorders was rather high despite low values of kappa. The proportions of total agreement in diagnostic decisions ranged from 68% to 97%, indicating that, overall, civil commitment evaluations were a reliable process. The strengths and limitations of alternative methods of measuring inter-rater reliability are illustrated, and implications for policy and practice are discussed.

Key words: sex offender, sexual predator, civil commitment, inter-rater reliability, DSM, diagnosis, kappa

Authors’ note: The authors wish to thank John Morin and Paul Stern for their reviews of an earlier draft. Their valuable suggestions helped strengthen the manuscript.

Forensic examiners are often called upon to render opinions as to whether a person has a mental disorder. These questions are encountered in various legal contexts, including competency, criminal culpability, workers’ compensation, torts, sentencing, and psychiatric commitment (Melton, Petrila, Poythress, & Slobogin, 1997).
Seventeen states have implemented civil commitment laws for sex offenders, under which sexually violent predators (SVPs) can be involuntarily treated in secure facilities beyond their criminal sentence. Although not necessarily a legal requisite, evaluators typically use the nosology described in the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision (DSM-IV-TR) (American Psychiatric Association, 2000) for such purposes. Use of the DSM in forensic matters has been subject to criticism, particularly for a lack of data regarding reliability (Campbell, 1999, 2004; Grove, Andreasen, McDonald-Scott, Keller, & Shapiro, 1981; Kirk & Kutchins, 1994; Meyer, 2002).

The purpose of this study is to examine the strengths and weaknesses of different methods of assessing the reliability of diagnostic decisions within the forensic context of sex offender civil commitment. To illustrate these issues, we analyzed data from a recent study investigating the reliability of SVP civil commitment criteria (Levenson, 2004). Levenson’s data have some unique advantages for this purpose: they were derived from a large sample that was carefully assembled, they came from field clinicians involved in real-world decisions rather than simulated cases, and they involved a forensic issue where the presence of a mental condition was required as part of the commitment criteria, thus obligating each clinician to specifically assess for mental illness using the DSM.
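
To make the contrast between kappa and raw agreement concrete, the following minimal Python sketch computes both statistics for a single 2x2 table of two evaluators' yes/no decisions. The cell counts are hypothetical, chosen only to mimic a diagnosis with a low base rate; they are not taken from Levenson (2004), and the function names are illustrative assumptions rather than anything used in the original analyses.

```python
# Illustrative sketch: raw agreement versus Cohen's kappa for two raters
# making a yes/no diagnostic decision. All counts below are HYPOTHETICAL.
#
# 2x2 table layout:
#   a = both raters say "present"
#   b = rater 1 "present", rater 2 "absent"
#   c = rater 1 "absent",  rater 2 "present"
#   d = both raters say "absent"


def proportion_agreement(a, b, c, d):
    """Observed proportion of cases on which the two raters agree."""
    n = a + b + c + d
    return (a + d) / n


def cohen_kappa(a, b, c, d):
    """Cohen's kappa: agreement corrected for chance-expected agreement.

    Undefined (division by zero) if chance-expected agreement equals 1.
    """
    n = a + b + c + d
    p_o = (a + d) / n
    # Chance agreement from each rater's marginal "present"/"absent" rates.
    p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (p_o - p_e) / (1 - p_e)


# Hypothetical rare diagnosis: each rater assigns it to 5 of 100 cases.
a, b, c, d = 2, 3, 3, 92

p_o = proportion_agreement(a, b, c, d)
kappa = cohen_kappa(a, b, c, d)

print(f"Proportion of agreement: {p_o:.2f}")
print(f"Cohen's kappa:           {kappa:.2f}")
```

With these made-up counts the raters agree on 94 of 100 cases, yet kappa is only about .37. Nearly all of the agreement concerns the absence of the disorder, so chance-expected agreement is already about .905 and little room remains for kappa to credit the raters.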