Preliminary analyses of prospective predictive validity of the HCR-20 V3 and the HARM: Assessing violence risk in forensic psychiatric inpatients

Alana N. Cook 1,2,3, Heather M. Moulden 1,2, Mini Mamak 1,2, Gary Chaimowitz 1,2, Shams Lalani 4, Katrina Messina 2, & Yan L. Lim 3

1 Department of Psychiatry & Behavioural Neurosciences, McMaster University
2 Forensic Psychiatry Service, St. Joseph’s Healthcare, Hamilton
3 Department of Psychology, Simon Fraser University
4 Department of Psychology, Neuroscience, & Behaviour, McMaster University

Contact: cooka@stjoes.ca

REFERENCES
Chaimowitz, G., & Mamak, M. (2011). Companion Guide to the Aggressive Incidents Scale and the Hamilton Anatomy of Risk Management (HARM). Hamilton, Canada: St. Joseph’s Healthcare, Hamilton.
Douglas, K. S., Hart, S. D., Webster, C., & Belfrage, H. (2013). HCR-20V3: Assessing risk for violence. Burnaby, Canada: Mental Health Law and Policy Institute.
Swets, J. A. (1988). Measuring the accuracy of diagnostic systems. Science, 240, 1285-1293. doi: 10.1126/science.3287615
Wilson, C. M., Desmarais, S. L., Nicholls, T. L., Hart, S. D., & Brink, J. (2014). Predictive validity of dynamic factors: Assessing violence risk in forensic psychiatric inpatients. Behavioral Sciences and the Law.
Yudofsky, S. C., Silver, J. M., Jackson, W., Endicott, J., & Williams, D. W. (1986). The Overt Aggression Scale for the objective rating of verbal and physical aggression. American Journal of Psychiatry, 143, 35-39.

Figure 1. ROC curve, HCR Total Scores at A1 for any violence in the 3-month FU (AUC = .78 [95% CI, .63-.93], p < .01)

Assessment Period | HCR-20 Scale | Any Aggression (3-month FU) | FU1 (n = 21) | FU2 (n = 12) | FU3 (n = 7)
A1 | H | .64 [.43-.86] | .61 [.41-.80] | |
A1 | C | .68 [.48-.87] | .51 [.32-.70] | |
A1 | R | .57 [.38-.76] | .57 [.38-.76] | |
A1 | C+R | .66 [.49-.85] | .55 [.36-.75] | |
A1 | Total Score | .78** [.63-.93] | .66 [.48-.85] | |
A1 | 1-month SRR | .60 [.39-.80] | .48 [.29-.68] | |
A1 | 3-month SRR | .63 [.44-.84] | .57 [.37-.76] | |
A2 | H | | | .52 [.32-.72] |
A2 | C | | | .87** [.74-.99] |
A2 | R | | | .78* [.60-.95] |
A2 | C+R | | | .82** [.67-.97] |
A2 | Total Score | | | .73* [.54-.91] |
A2 | 1-month SRR | | | .78** [.63-.94] |
A2 | 3-month SRR | | | .62 [.42-.82] |
A3 | H | | | | .69 [.44-.95]
A3 | C | | | | .65 [.41-.90]
A3 | R | | | | .73 [.48-.99]
A3 | C+R | | | | .72 [.46-.98]
A3 | Total Score | | | | .64 [.41-.87]
A3 | 1-month SRR | | | | .57 [.31-.83]
A3 | 3-month SRR | | | | .58 [.32-.84]

Assessment Period | HARM Scale | Any Aggression (3-month FU) | FU1 (n = 21) | FU2 (n = 12) | FU3 (n = 7)
A1 | H | .54 [.34-.75] | .61 [.42-.81] | |
A1 | D | .56 [.37-.76] | .56 [.38-.76] | |
A1 | Total Score | .61 [.42-.81] | .59 [.39-.80] | |
A1 | SRR Immediate | .63 [.44-.84] | .56 [.37-.75] | |
A1 | SRR Short-Term | .67 [.47-.87] | .60 [.40-.79] | |
A2 | H | | | .48 [.28-.69] |
A2 | D | | | .66 [.47-.85] |
A2 | Total Score | | | .70 [.51-.89] |
A2 | SRR Immediate | | | .80** [.64-.97] |
A2 | SRR Short-Term | | | .83** [.68-.98] |
A3 | H | | | | .48 [.21-.73]
A3 | D | | | | .57 [.33-.81]
A3 | Total Score | | | | .54 [.30-.79]
A3 | SRR Immediate | | | | .56 [.32-.81]
A3 | SRR Short-Term | | | | .52 [.27-.77]

BACKGROUND: As we move away from coercive intervention in mental health, there is increasing emphasis on the accurate assessment of, and response to, potential violence based on dynamic (i.e., changeable) risk. While recent studies provide evidence supporting the role of dynamic factors in predicting violence (Wilson et al., 2013), there is a need to replicate and extend this research. The present study evaluated the predictive validity of two risk assessment guidelines that were developed to assess risk for future violence: the new version of the HCR-20, version 3 (Douglas et al., 2013), which is the most widely used violence risk assessment tool internationally, and the Hamilton Anatomy of Risk Management (HARM; Chaimowitz & Mamak, 2011), which is a new instrument developed and implemented at St.
Joseph’s Healthcare Hamilton and used at other hospitals and community organizations in Canada.

Static and dynamic risk factors and summary risk judgments (as measured by the HCR-20 V3 and the HARM) were rated monthly for all eligible participants (N = 39; 85% male) residing on general and secure units of a large forensic psychiatry program in Southern Ontario. Participants were all under the jurisdiction of the Ontario Review Board and had been found Unfit to Stand Trial (Criminal Code of Canada, Section 2) or Not Criminally Responsible by Reason of Mental Disorder (NCRMD; Criminal Code of Canada, Section 16).

The HCR-20 V3 was coded monthly by trained raters using a review of participants’ medical records. Raters made judgments on the presence and relevance of HCR-20 V3 risk factors across the three scales (Historical, HCR H; Clinical, HCR C; Risk Management, HCR R) and made summary risk ratings (HCR SRRs) for risk for future violence (1- and 3-month trajectories, rated as low, moderate, or high). Interrater reliability for a random selection of the HCR-20 V3 SRRs for risk for future violence was excellent at both 1 and 3 months (n = 11; 1-month ICC1 = .80, 3-month ICC1 = .80; both p < .001).

The HARM was coded monthly by the clinical team for the presence of Historical (HARM H) and Dynamic (HARM D) risk factors (n = 14) and for summary risk ratings (HARM SRRs) for risk for future violence, both with and without professional supports in place (low, moderate, high).

Violent outcomes were assessed prospectively using monthly ratings on a modified version of the Overt Aggression Scale (OAS; Yudofsky et al., 1986). Specifically, the outcome included physical aggression toward persons or objects, or sexually inappropriate behaviour. Data collection for this project is ongoing; only preliminary results for three assessment periods (A1-A3) and the three subsequent 1-month follow-up periods (FU1-FU3) are reported.

DATA ANALYSIS: Predictor variables were the historical factors (HCR H and HARM H) and the dynamic factors of the HCR-20 V3 and HARM (HCR-20 V3 C, R, C+R, Total Scores, and SRRs; HARM D, Total Scores, and SRRs). Prior to the predictive analyses, a-priori assumptions were evaluated; that is, we tested whether predictor scores and ratings differed significantly between outcome groups. For the 3-month outcome, the only significant difference between outcome groups was in the HCR-20 V3 Total Numerical Scores (p < .05). For the 1-month outcome periods, significant differences were present only during A2/FU2, for HCR C, R, C+R, and HCR Total Numerical Scores (all p < .01) and for HCR 1-month SRRs, HARM Imminent Risk With Professional Support SRRs, and HARM Short-Term Risk With Professional Support SRRs (all p < .01). Power analysis also indicated that about 60 positive violent cases across the follow-up periods are needed to detect medium effects with ROC analyses (the data currently include 30 positive violent cases across the three time points). We examined the predictive validity of the significant variables of the HCR-20 V3 and the HARM by conducting Receiver Operating Characteristic (ROC) analyses. Area Under the Curve (AUC) values were interpreted using Swets' (1988) criteria: values between .50 and .70 indicate poor predictive accuracy, values between .70 and .90 indicate good predictive accuracy, and values greater than .90 indicate excellent accuracy.

RESULTS: First, we examined whether assessments of historical and dynamic factors (HCR Total Scores) at A1 predicted institutional violence during the full 3-month follow-up period (see Figure 1). The AUC for the HCR-20 V3 Total Numerical Score was significant, .78 [95% CI, .63-.93], p < .01. Next, we ran separate ROC analyses to investigate whether A2 scores and final risk judgments (HCR C, R, C+R, Total Scores, 1-month SRRs; HARM Imminent SRRs, Short-Term SRRs) predicted institutional violence during the corresponding 1-month follow-up period.
The ROC analyses produced significant AUC values indicating good predictive accuracy for each of the significant a-priori predictors examined; AUCs ranged from .73 to .87, all p < .05. The results of the ROC analyses for the HCR-20 V3 and HARM are presented in Table 1 and Table 2, respectively.

Table 1. AUC [95% CI] for HCR-20 V3 Scores and Risk Estimates Predicting Physical Aggression Toward Persons or Objects or Sexually Inappropriate Behaviour (N = 30, across time periods)
Note: H = historical risk factors, C = clinical risk factors, R = risk management factors; Results in grey font did not meet a-priori assumptions and were not expected to be significant; SRR = HCR-20 V3 Summary Risk Rating for Case Priority/Risk for Violence (low, moderate, high); * = p < .05, ** = p < .001

Table 2. AUC [95% CI] for HARM Scores and Risk Estimates Predicting Physical Aggression Toward Persons or Objects or Sexually Inappropriate Behaviour (N = 30, across time periods)
Note: H = historical risk factors, D = dynamic risk factors; Results in grey font did not meet a-priori assumptions and were not expected to be significant; SRR = HARM Summary Risk Rating for risk for violence with professional support (low, moderate, high); Imminent = days to weeks, Short-term = weeks to months; ** = p < .001

DISCUSSION: Our data included only 3 of the 12 planned monthly follow-up assessment periods; thus, the analyses we were able to conduct, and the results, are limited. Despite this limitation, the results indicated that the assessments were variably predictive of violence in the full 3-month follow-up period and in the corresponding 1-month follow-up periods. There are three possible interpretations of the present findings. One, the risk assessment tools are not valid; that is, the tools do not accurately predict future violence in our sample. Two, the risk assessors are not making valid judgments using the tools.
Or, three, the risk assessment tools and the judgments made are valid, and the clinical team is accurately recognizing violence risk and effectively managing or preventing violence in the follow-up periods. The current analysis does not include management of violence risk as a mediator of violent outcomes. As part of our project, we have also collected the number and type of management strategies in place at each assessment period. We plan to conduct analyses that assess management as a mediator of violence risk, in order to test the third possible interpretation of these findings. We also plan to examine the incremental validity of the dynamic risk factors over the historical risk factors, and the sensitivity of the HCR and HARM over time, particularly because the HCR is not intended to assess risk for violence over a short time period (i.e., 1 month). Our definition of violence in the current study was also broad, including all intimidating, threatening, or actual violent acts. We also plan to conduct predictive analyses using physical harm only as the outcome, to compare the predictive accuracy of the tools for broad and narrow definitions of violence. At this time, physical harm was present in only four instances, which prevented us from running these analyses.
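
As an illustration of the AUC interpretation described under DATA ANALYSIS, the following is a minimal sketch rather than the analysis actually run for this poster. It computes the AUC as the proportion of (violent, non-violent) case pairs in which the violent case received the higher score, with ties counted as one half, and then labels the value using the Swets (1988) bands quoted above. The function names, the example scores, and the choice to treat a value of exactly .70 as "good" are assumptions made for illustration only; the example data are hypothetical and are not study data.

```python
from typing import Sequence


def auc_pairwise(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """AUC as the share of (violent, non-violent) pairs ranked correctly.

    outcomes uses 1 for a violent outcome in follow-up and 0 otherwise.
    Tied scores contribute half a correctly ranked pair.
    """
    violent = [s for s, y in zip(scores, outcomes) if y == 1]
    non_violent = [s for s, y in zip(scores, outcomes) if y == 0]
    if not violent or not non_violent:
        raise ValueError("need at least one case in each outcome group")

    correct = 0.0
    for v in violent:
        for n in non_violent:
            if v > n:
                correct += 1.0
            elif v == n:
                correct += 0.5
    return correct / (len(violent) * len(non_violent))


def swets_band(auc: float) -> str:
    """Label an AUC using the bands quoted in the text (Swets, 1988).

    Assumption: the boundary value .70 is assigned to "good", because the
    quoted ranges overlap at their endpoints.
    """
    if auc > 0.90:
        return "excellent"
    if auc >= 0.70:
        return "good"
    if auc >= 0.50:
        return "poor"
    return "below chance"


# Hypothetical total scores and outcomes for six cases (not study data).
example_scores = [12, 18, 25, 30, 22, 15]
example_outcomes = [0, 0, 1, 1, 0, 1]

example_auc = auc_pairwise(example_scores, example_outcomes)
print(round(example_auc, 2), swets_band(example_auc))  # prints: 0.78 good
```

The pairwise definition is used here because it needs no external libraries and makes the meaning of the AUC explicit; it does not produce the confidence intervals or p values reported in Tables 1 and 2.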