Expert Systems With Applications 170 (2021) 114560
Available online 5 January 2021
0957-4174/© 2021 Elsevier Ltd. All rights reserved.

A novel association rule mining method for the identification of rare functional dependencies in Complex Technical Infrastructures from alarm data

Federico Antonello a, Piero Baraldi a,*, Ahmed Shokry a,b, Enrico Zio a,c,d, Ugo Gentile e, Luigi Serio e

a Energy Department, Politecnico di Milano, Via Lambruschini 4, 20156 Milan, Italy
b Center for Applied Mathematics, Ecole Polytechnique, Route de Saclay, 91120 Palaiseau, France
c MINES ParisTech, PSL Research University, CRC, Sophia Antipolis, France
d Eminent Scholar, Department of Nuclear Engineering, College of Engineering, Kyung Hee University, Republic of Korea
e CERN, 1211 Geneva 23, Switzerland

ARTICLE INFO

Keywords: Complex Technical Infrastructures; Rare functional dependencies; Association rules; Alarm data; Abnormal behaviors

ABSTRACT

This work presents a data-driven method for identifying rare functional dependencies among components of different systems of Complex Technical Infrastructures (CTIs) from large-scale databases of alarm messages. It is based on the representation of the alarm data in binary form, the use of a novel association rule mining algorithm tailored to discovering rare dependencies among components of different systems, and the identification of groups of functionally dependent components. The proposed method is applied to a synthetic alarm database generated by a simulated CTI model and to a real large-scale database of alarms collected in the CTI of CERN (European Organization for Nuclear Research). The obtained results show the effectiveness of the proposed method.

1. Introduction

The analysis of the vulnerability and resilience of Complex Technical Infrastructures (CTIs) based on expert knowledge, first-principle models and/or design documentation is very difficult and in most cases unattainable (Sage & Cuppan, 2001).
Particularly, the identification of the functional dependencies, which play a crucial role in both the vulnerability and resilience of CTIs, is hard to achieve with classical methods of system decomposition and logic analysis (Zio, 2016; Rebello, Hongyang, & Ma, 2018). Alternatively, in the current Industry 4.0 era, with its digitalization developments, the analysis of complex and large-scale systems like CTIs can greatly benefit from the large amount of data, including monitored signals and alarms, collected on the components and systems thanks to recent advancements in sensors, data acquisition and monitoring technologies (Serio et al., 2018; Antonello et al., 2019a). Specifically, for alarms, Association Rule Mining (ARM) techniques have been developed to extract from alarm databases hidden knowledge and information about system behavior. This knowledge is typically captured in the form of rules describing the conditional occurrence of malfunctions, abnormal behaviors or failures detected and alarmed (Klemettinen et al., 1999a, 1999b; Amani, Fathi, & Dehghan, 2005; Han, Kim, & Sohn, 2009; Lozonavu, Vlachou-Konchylaki, & Huang, 2016). In this context, association rule mining methods have typically been developed for identifying temporal and/or spatial patterns in sets of alarms with the aim of fault isolation and root cause analysis, without particular focus on the identification of functional dependencies among components, whose knowledge is relevant to understanding vulnerabilities and deploying resilience. For this reason, Antonello et al. (2019a) have proposed the use of ARM for the identification of functional dependencies among CTI components. Specifically, the proposed ARM method relies on an Apriori-based algorithm that mines alarm databases to extract patterns of alarms which occur together frequently and within a short period of time, and derives the association rules among them.
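
As an illustration of the general idea only, the following minimal Python sketch groups alarms into fixed time windows, runs a textbook level-wise Apriori search over the resulting sets, and derives rules by confidence. It is not the algorithm of Antonello et al. (2019a), nor the novel method of this work: the alarm log, the component names, the 60-second window and the support and confidence thresholds are all hypothetical values chosen for the example.

```python
from itertools import combinations

# Hypothetical alarm log: (timestamp in seconds, component that raised the alarm).
ALARMS = [
    (5, "pump_A"), (8, "valve_B"),
    (65, "pump_A"), (70, "valve_B"), (72, "fan_C"),
    (130, "fan_C"),
    (185, "pump_A"), (190, "valve_B"),
    (250, "chiller_D"),
]


def to_transactions(alarms, window):
    """Group alarms into fixed, non-overlapping time windows (one set per window)."""
    buckets = {}
    for timestamp, component in alarms:
        buckets.setdefault(timestamp // window, set()).add(component)
    return [frozenset(items) for _, items in sorted(buckets.items())]


def support(itemset, transactions):
    """Fraction of windows in which every component of `itemset` alarmed."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def apriori(transactions, min_support):
    """Level-wise search: frequent k-itemsets generate the (k+1)-candidates."""
    level = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while level:
        kept = {}
        for candidate in level:
            s = support(candidate, transactions)
            if s >= min_support:
                kept[candidate] = s
        frequent.update(kept)
        k += 1
        # Join frequent itemsets, then prune candidates with an infrequent subset.
        joined = {a | b for a in kept for b in kept if len(a | b) == k}
        level = {
            c for c in joined
            if all(frozenset(sub) in kept for sub in combinations(c, k - 1))
        }
    return frequent


def rules(frequent, min_confidence):
    """Derive rules antecedent -> consequent from the frequent itemsets."""
    found = []
    for itemset, s in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                confidence = s / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append((antecedent, itemset - antecedent, s, confidence))
    return found


transactions = to_transactions(ALARMS, window=60)
frequent = apriori(transactions, min_support=0.4)
for antecedent, consequent, s, c in rules(frequent, min_confidence=0.8):
    print(sorted(antecedent), "->", sorted(consequent),
          f"support={s:.2f} confidence={c:.2f}")
```

In this toy log, the hypothetical components pump_A and valve_B alarm together in three of the five windows, so the pair passes the support threshold. The single chiller_D alarm does not, which shows how a minimum-support filter discards infrequent events.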
* Corresponding author.
E-mail addresses: federico.antonello@polimi.it (F. Antonello), piero.baraldi@polimi.it (P. Baraldi), ahmed.shokry@polimi.it (A. Shokry), enrico.zio@polimi.it (E. Zio), ugo.gentile@cern.ch (U. Gentile), Luigi.Serio@cern.ch (L. Serio).
Contents lists available at ScienceDirect
Journal homepage: www.elsevier.com/locate/eswa
https://doi.org/10.1016/j.eswa.2021.114560
Received 17 April 2020; Received in revised form 31 December 2020; Accepted 31 December 2020

Apriori-based mining algorithms employ a level-wise iterative search mechanism, which scans the whole database to identify frequent patterns and drives the search for other frequent patterns (Srikant and