Psychological Assessment 1995, Vol. 7, No. 4, 472-483
Copyright 1995 by the American Psychological Association, Inc. 1040-3590/95/$3.00

Reliability and Validity of the Hamilton Depression Inventory: A Paper-and-Pencil Version of the Hamilton Depression Rating Scale Clinical Interview

William M. Reynolds, University of British Columbia
Kenneth A. Kobak, University of Wisconsin—Madison

A self-report, paper-and-pencil version of the Hamilton Depression Rating Scale (HDRS; M. Hamilton, 1960) was developed. This measure, the Hamilton Depression Inventory (HDI; W. M. Reynolds & K. A. Kobak, 1995), consists of a 23-item full form, a 17-item form, and a 9-item short form. The 17-item HDI form corresponds in content and scoring to the standard 17-item HDRS. With a sample of psychiatric outpatients with major depression (n = 140), anxiety disorders (n = 99), and nonreferred community adults (n = 118), the HDI forms demonstrated high levels of reliability (r_α = .91 to .94, r_tt = .95 to .96). Extensive validity evidence was presented, including content, criterion-related, construct, and clinical efficacy of the HDI cutoff score. Overall, the data support the reliability and validity of the HDI as a self-report measure of severity of depression.

William M. Reynolds, Psychoeducational Research and Training Centre, University of British Columbia, Vancouver, British Columbia, Canada; Kenneth A. Kobak, Department of Counseling Psychology, University of Wisconsin—Madison.

In our research with the Hamilton Depression Inventory and our earlier work on the computer-administered Hamilton Depression Rating Scale (HDRS) we have been assisted by a number of individuals. We are grateful to John H. Greist of the University of Wisconsin School of Medicine and the Dean Foundation for Health, Education and Research for his support. We are grateful to James W. Jefferson, David J. Katzelnick, and Robin L. Chene for providing diagnostic evaluations; and to Julie Mantle, Amy Rock, Mary Lokken, Barbara Woodhouse, Linda Harris, Todd Liolios, James Mazza, and Kathleen Matkowski for conducting interviews with the HDRS. We thank Chuck Pulvino from the Department of Counseling Psychology at the University of Wisconsin—Madison for providing staff support and facilities for the project. We also wish to express our appreciation to Margaret Reynolds for her assistance in the data coding and entry. An earlier version of this article was presented at the 102nd Annual Conference of the American Psychological Association, Los Angeles, California, August, 1994.

Correspondence concerning this article should be addressed to William M. Reynolds, Psychoeducational Research and Training Centre, University of British Columbia, 2125 Main Mall, Vancouver, British Columbia, Canada V6T 1Z4. Electronic mail may be sent via Internet to william_reynolds@mtsg.ubc.ca.

Depression is one of the most prevalent mental health problems in the United States (Kessler et al., 1994; Regier et al., 1988), with 1-month prevalence rates ranging from 2% to 3% for major depression and over 6% for any form of affective disorder (Regier et al., 1993). The fourth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV; American Psychiatric Association, 1994) reports a point-prevalence rate for major depression of between 5% and 9% for women and between 2% and 3% for men.

For decades, mental health professionals have relied on semistructured clinical interviews and self-report measures for the identification of depression in adults. The use of these measures, most of which are considered severity measures of depression, does not provide for the formal diagnosis of depression. However, this does not limit their usefulness for the evaluation of the severity of depressive symptomatology in clinical and research applications (Reynolds, 1994).
The Hamilton Depression Rating Scale

The Hamilton Depression Rating Scale (HDRS; Hamilton, 1960, 1967) was one of the first semistructured interview measures developed for the clinical evaluation of severity of depression in adults. The HDRS is one of the most frequently used clinical interview measures of the severity of depression (e.g., Edwards et al., 1984; Endicott, Cohen, Nee, Fleiss, & Sarantakos, 1981; Fava, Kellner, Munari, & Pavan, 1982; Sayer et al., 1993; Williams, 1988) and is often used as the criterion measure against which self-report measures of depression are validated (e.g., Carroll, Feinberg, Smouse, Rawson, & Greden, 1981; Montgomery & Asberg, 1979).

Although frequently used in research, the relative lack of standardized administration instructions and scoring criteria for the HDRS has been problematic. Cicchetti and Prusoff (1983) in a study of interrater reliability of a 22-item version of the HDRS found low levels of reliability for individual items, with 14 of the 22 items demonstrating intraclass correlation coefficients of less than .40. The lack of scoring guidelines has led a number of investigators (e.g., Endicott et al., 1981; Miller, Bishop, Norman, & Maddever, 1985; Williams, 1988) to develop item modifications and suggest administration and scoring procedures. The issues of training, scoring, and differences in version of the HDRS used in research have been evaluated and discussed by Hooijer et al. (1991), who found small but meaningful differences across HDRS versions and training.

Several self-report versions of the HDRS have been developed by researchers, two of which were based on computer admin-