IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL. 58, NO. 1, FEBRUARY 2011 277

Anomaly Detection in Nuclear Power Plants via Symbolic Dynamic Filtering

Xin Jin, Student Member, IEEE, Yin Guo, Soumik Sarkar, Student Member, IEEE, Asok Ray, Fellow, IEEE, and Robert M. Edwards

Abstract—Tools of sensor-data-driven anomaly detection facilitate condition monitoring of dynamical systems especially if the physics-based models are either inadequate or unavailable. Along this line, symbolic dynamic filtering (SDF) has been reported in literature as a real-time data-driven tool of feature extraction for pattern identification from sensor time series. However, an inherent difficulty for a data-driven tool is that the quality of detection may drastically suffer in the event of sensor degradation. This paper proposes an anomaly detection algorithm for condition monitoring of nuclear power plants, where symbolic feature extraction and the associated pattern classification are optimized by appropriate partitioning of (possibly noise-contaminated) sensor time series. In this process, the system anomaly signatures are identified by masking the sensor degradation signatures. The proposed anomaly detection methodology is validated on the International Reactor Innovative & Secure (IRIS) simulator of nuclear power plants, and its performance is evaluated by comparison with that of principal component analysis (PCA).

Index Terms—Data-driven fault detection, feature extraction, pattern classification, symbolic dynamics, time series analysis.

I. INTRODUCTION

CONDITION monitoring and timely detection of incipient faults are critical for operational safety and performance enhancement of nuclear power plants. There are various sources of anomalous behavior (i.e., deviation from the nominal condition) in plant operations, which could be the consequence of a fault in a single component or simultaneous faults in multiple components.
Often it is difficult for the plant operator to detect the anomaly and locate the associated anomalous component(s), especially if the anomaly is small and evolves slowly. Upon occurrence of an anomalous event and subsequent pervasion of its effects, the operator could be overwhelmed by the sheer volume of information generated simultaneously from various sources. Therefore, it would be beneficial to develop an automated condition monitoring system to assist the plant operator in detecting the anomalies and isolating the anomalous components.

Manuscript received August 17, 2010; revised October 01, 2010; accepted October 10, 2010. Date of publication December 03, 2010; date of current version February 09, 2011. This work was supported in part by the U.S. Department of Energy under NERI-C Grant DE-FG07-07ID14895 and by NASA under Cooperative Agreement NNX07AK49A. Any opinions, findings and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the sponsoring agencies.
The authors are with the Department of Mechanical and Nuclear Engineering, Pennsylvania State University, University Park, PA 16802 USA (e-mail: xuj103@psu.edu; yxg141@psu.edu; szs200@psu.edu; axr2@psu.edu; rmenuc@engr.psu.edu).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TNS.2010.2088138

Condition monitoring algorithms are primarily divided into two categories, namely, model-based and data-driven. Both model-based and data-driven techniques have been reported in the literature for condition monitoring of nuclear power plants. Examples of model-based condition monitoring can be found in [1], [2]. Among data-driven tools, neural networks (NN) and principal component analysis (PCA)-based tools [3]–[6] are the most popular.
Although model-based techniques have their advantages in terms of physical interpretation, their reliability and computational efficiency for condition monitoring often decrease as the system complexity increases. On the other hand, data-driven techniques are expected to remain largely reliable and computationally efficient in spite of increased system complexity if the goal is to monitor the input-output information from an ensemble of (appropriately calibrated) sensors while considering the entire system as a black box. However, unless the ensemble of acquired information is systematically handled, data-driven techniques may become computationally intensive and the performance of condition monitoring may deteriorate due to sensor degradation. Furthermore, data-driven techniques require a high volume of training data (e.g., component malfunction data in the present context).

A problem with handling time series data is its volume and the associated computational complexity; therefore, the available information must be appropriately compressed via transformation of high-dimensional data sets into low-dimensional features with minimal loss of class separability. In our previous work [7], we reported Symbolic Dynamic Filtering (SDF) for detection of anomalies (i.e., deviations from the nominal condition) in dynamical systems. The SDF method has been shown to be useful for feature extraction from time series and has been experimentally validated for real-time execution in different applications (e.g., electronic circuits [8] and fatigue damage monitoring in polycrystalline alloys [9]).

A major challenge in any sensor-data-driven detection tool is to identify the actual anomaly in the system in the presence of sensor degradation (e.g., drift and noise) without succumbing to a large number of false alarms or missed detections.
The situation becomes even more critical if the control system uses observations from the degraded sensors as feedback signals and thereby distorts the control inputs. Traditionally, redundant sensors along with methods based on analytic redundancy have been used for sensor anomaly identification [10], [11].

0018-9499/$26.00 © 2010 IEEE
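
As a rough illustration of the symbolic feature extraction referred to in this Introduction, the following Python sketch converts a time series into symbols, estimates symbol transition probabilities as a low-dimensional feature vector, and scores a test series by its distance from a nominal reference. It is a simplified sketch, not the algorithm proposed in this paper. The equal-frequency (maximum-entropy) partitioning, the depth-one transition model, the Euclidean distance, the synthetic noisy sine signals, and all function names are assumptions made only for illustration.

```python
import math
import random
from bisect import bisect_right
from collections import Counter


def max_entropy_boundaries(reference, alphabet_size):
    """Boundaries that place roughly equal numbers of reference samples in each cell
    (an assumed partitioning choice, not the optimized partitioning of the paper)."""
    ordered = sorted(reference)
    n = len(ordered)
    return [ordered[(k * n) // alphabet_size] for k in range(1, alphabet_size)]


def symbolize(series, boundaries):
    """Map each sample to the index of the partition cell that contains it."""
    return [bisect_right(boundaries, x) for x in series]


def transition_features(symbols, alphabet_size):
    """Row-normalized symbol transition matrix (depth one), flattened row by row."""
    counts = Counter(zip(symbols[:-1], symbols[1:]))
    features = []
    for i in range(alphabet_size):
        row_total = sum(counts[(i, j)] for j in range(alphabet_size))
        for j in range(alphabet_size):
            if row_total:
                features.append(counts[(i, j)] / row_total)
            else:
                # Unvisited symbol: fall back to a uniform row (assumption).
                features.append(1.0 / alphabet_size)
    return features


def anomaly_measure(nominal_features, test_features):
    """Euclidean distance between feature vectors (one possible choice of measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(nominal_features, test_features)))


def noisy_sine(omega, n, noise_std, rng):
    """Hypothetical stand-in for a sensor signal: a sine wave plus Gaussian noise."""
    return [math.sin(omega * t) + rng.gauss(0.0, noise_std) for t in range(n)]


def demo(seed=0, alphabet_size=4):
    rng = random.Random(seed)
    nominal = noisy_sine(0.2, 4000, 0.05, rng)   # reference data, nominal condition
    healthy = noisy_sine(0.2, 4000, 0.05, rng)   # new data, same condition
    faulty = noisy_sine(0.3, 4000, 0.05, rng)    # new data, changed dynamics
    bounds = max_entropy_boundaries(nominal, alphabet_size)  # fixed at nominal condition
    reference = transition_features(symbolize(nominal, bounds), alphabet_size)
    healthy_score = anomaly_measure(
        reference, transition_features(symbolize(healthy, bounds), alphabet_size))
    faulty_score = anomaly_measure(
        reference, transition_features(symbolize(faulty, bounds), alphabet_size))
    return healthy_score, faulty_score


if __name__ == "__main__":
    healthy_score, faulty_score = demo()
    print("healthy:", healthy_score, "faulty:", faulty_score)
```

In this sketch the partition is computed once from nominal data and then held fixed, so that a change in the underlying dynamics appears as a change in the transition probabilities. How the partition should be chosen when the sensor data are themselves degraded is precisely the question that the sketch leaves open.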