Copyright © 2007 by ASME

Proceedings of GT2007: Turbo Expo 2007: Power for Land, Sea and Air
May 14-17, 2007, Montreal, Canada

GT2007-28343

DATA VISUALIZATION, DATA REDUCTION AND CLASSIFIER FUSION FOR INTELLIGENT FAULT DETECTION AND DIAGNOSIS IN GAS TURBINE ENGINES

William Donat, Kihoon Choi, Woosun An, Satnam Singh, Krishna Pattipati
University of Connecticut
Storrs, CT 06268, USA
krishna@engr.uconn.edu

ABSTRACT

In this paper, we investigate four key issues associated with data-driven approaches to fault classification, using the Pratt and Whitney commercial dual-spool turbofan engine data as a test case. The four issues are: (1) Can we characterize, a priori, the difficulty of fault classification via self-organizing maps? (2) Do data reduction techniques improve fault classification performance and enable the implementation of data-driven classification techniques in memory-constrained digital electronic control units (DECUs)? (3) When does adaptive boosting, an incremental fusion method that successively combines moderately inaccurate classifiers into accurate ones, help improve classification performance? (4) How can classifier fusion architectures be synthesized to improve the overall diagnostic accuracy? The classifiers studied in this paper are the support vector machine (SVM), probabilistic neural network (PNN), k-nearest neighbor (KNN), principal component analysis (PCA), Gaussian mixture models (GMM), and a physics-based single fault isolator (SFI). Because these algorithms operate on large volumes of data and are generally computationally expensive, we reduce the dataset using the multi-way partial least squares (MPLS) method. This has the added benefits of improved diagnostic accuracy and smaller memory requirements. The performance of the moderately inaccurate classifiers is improved using adaptive boosting (AdaBoost). These results are compared with those of the individual classifiers and of different fusion architectures.
We show that fusion reduces the variability in diagnostic accuracy and is most useful when combining moderately inaccurate classifiers.

INTRODUCTION

Safety-critical systems, such as gas turbine engines, demand real-time fault detection and isolation (FDI) and a decision support system that prescribes corrective actions, so that the system can continue to function without jeopardizing the safety of the personnel and equipment involved. Owing to the large number of failure modes, the substantial number of operating modes, and the possible simultaneous occurrence of multiple faults, FDI in complex safety-critical systems is a formidable challenge.

Engine health-monitoring methods can be classified as being associated with one or more of the following three approaches: model-based, knowledge-based, or data-driven. Model-based FDI has progressed significantly over the last four decades. In this approach, a mathematical model for FDI is developed from the underlying physics and dynamics of the mechanical system. The knowledge-based approach, on the other hand, uses qualitative models (e.g., cause-effect graphs) to develop monitoring methods, and is suited to situations where mathematical models are not readily available.

What if neither a mathematical model (model-based) nor a cause-effect graph model of system failures and their manifestations (knowledge-based) is available? The data-driven approach to FDI is an alternative, provided that system monitoring data is available. Because of its simplicity and adaptability, a data-driven approach can be customized without in-depth knowledge of the system.

In this paper, we employ SVM, PNN, KNN, PCA, GMM and SFI classifiers to investigate four key issues: visual characterization of the degree of difficulty in fault classification, data reduction for improved classification accuracy and real-time implementation, when to use adaptive boosting, and synthesizing fusion architectures.
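As a rough illustration of the adaptive boosting idea described in the abstract (successively combining moderately inaccurate classifiers into an accurate one), the following minimal Python sketch boosts one-dimensional threshold classifiers ("stumps") on a toy two-class dataset. The stump weak learner, the toy data, and all function names are assumptions made for illustration only; they are not the classifiers or the engine data used in this paper.

```python
import math

# Toy 1-D dataset (hypothetical): no single threshold separates the classes.
XS = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
YS = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]


def stump_predict(x, threshold, polarity):
    """Weak classifier: returns `polarity` if x >= threshold, else -polarity."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Train `rounds` stumps; return a list of (alpha, threshold, polarity)."""
    n = len(xs)
    weights = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)  # vote of this stump
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified samples, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


def training_errors(ensemble, xs, ys):
    """Number of training samples the ensemble misclassifies."""
    return sum(1 for x, y in zip(xs, ys) if predict(ensemble, x) != y)
```

On this toy set, `training_errors(adaboost(XS, YS, 1), XS, YS)` leaves three of the ten points misclassified, while three boosting rounds classify all ten correctly. The weak learners are individually inaccurate, but their weighted vote is not.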