Violence Detection from Emergency Room Reports

Lorenzo Caresio, Matteo Delsanto, Calogero Jerik Scozzaro, Enrico Mensa
Computer Science Department, University of Turin, Turin, Italy
name(s).surname@unito.it

Davide Colla
Independent Researcher, Turin, Italy
davi.colla@gmail.com

Carlo Mamo
Servizio Sovrazonale di Epidemiologia, ASL TO3, Turin, Italy
carlo.mamo@epi.piemonte.it

Alessio Pitidis
Reparto Epidemiologia Ambientale e Sociale, Istituto Superiore di Sanità, Rome, Italy
alessio.pitidis@iss.it

Arianna Vitale
Direzione sanitaria di Presidio Ospedaliero, A.O. Ordine Mauriziano di Torino, Turin, Italy
avitale@mauriziano.it

Daniele P. Radicioni
Computer Science Department, University of Turin, Turin, Italy
daniele.radicioni@unito.it

Abstract—This paper presents work on discriminating emergency room reports that describe violence-related injuries from those whose injuries have other causes. Real-world clinical narratives from emergency room reports are analyzed. We report the results obtained by experimenting with multiple architectures and assess their sustainability in settings with limited computational resources and time constraints. Our best models proved robust to medical and clinical language and to differences in reporting practices across hospitals, exhibiting high accuracy and running times suitable for deployment in real settings: in the violence detection task our system revealed thousands of records not previously annotated as containing injuries of violent origin; in the binary categorization task our best-performing models obtained a 97.7% F1 score; in the multiclass categorization task they reached a 74.6% average F1 score in categorizing violence perpetrators. Although further efforts are necessary to enable automatic systems to actively contribute to public health monitoring and clinical intervention, the obtained results can help bridge scientific research and everyday clinical practice.
Index Terms—Violence detection, emergency room data, clinical narratives, privacy in medical data, transformer architecture.

I. INTRODUCTION

In this work we analyze free-text clinical narratives from Italian Emergency Room Reports (ERRs) and classify such records based on whether the injuries they describe originate from violent events or not. As is widely acknowledged, violence against vulnerable people (mostly women, children and the elderly) is a serious societal concern in most countries. Such violence is known to be largely underestimated for many reasons, including the high pressure on healthcare workers, who do not have the time to properly record data; poorly functional interfaces; and the reluctance of victims to report events that often involve relatives or acquaintances [1]–[4]. Furthermore, episodes of violence are often repeated with increasing intensity and brutality (e.g., towards women), so that they have been found to be predictors of femicide [5]. Tracking violence is therefore a relevant task: being able to identify injuries that originate from violent events may provide precious cues for connecting injuries with their causes, which is essential for countering violence. It would also make it possible to devise alerting tools and procedures, and to collect descriptive statistics on socially relevant phenomena that are underestimated, that are associated with huge costs for health systems, and that negatively impact people's daily lives.

We collected as many as 700k ERRs. Given that the lack of data is known to afflict the medical domain in general (e.g., as an effect of the GDPR and other privacy policies) [6], [7], to the best of our knowledge the present work is the largest systematic analysis ever carried out in Italy focused on automatically recognizing violent records.
In this study we present the results obtained by categorizing such data with several neural architectures that allow for inference in nearly real time to discriminate violence-related injuries (V) from non-violent ones (NV). Our best models obtained high accuracy in detecting violent records and in categorizing violence perpetrators, and can thus be deployed in healthcare facilities as a core component that assists the medical staff through a reliable and inexpensive alerting mechanism, helping to bridge scientific research and clinical practice.

II. RELATED WORK

The automatic classification of clinical narratives, particularly to detect violence-related incidents in Emergency Room Reports, intersects with various research areas, including event extraction, text classification and domain-specific approaches in healthcare settings. Historically, event extraction has played a central role in systems that identify structured information from unstructured clinical texts [8], at times in conjunction with knowledge-intensive lexical resources [9] and medical thesauri [10]. Recent work in this area has relied on pre-trained language models such as BERT to extract events via trigger and argument identification, as exemplified by PLMEE [11]. Such approaches frame trigger detection as a classification problem, treating each token as either a potential initiator of