CGR-02-05

Failure Diagnosis of Discrete Event Systems: The Case of Intermittent Faults

Olivier Contant, Stéphane Lafortune, and Demosthenis Teneketzis

Department of Electrical Engineering and Computer Science, The University of Michigan, 1301 Beal Avenue, Ann Arbor, MI 48109–2122 USA
{olivier, stephane, teneket}@eecs.umich.edu; http://www.eecs.umich.edu/umdes/

August 28, 2002

Abstract

The diagnosis of “intermittent” faults in dynamic systems modeled as discrete event systems is considered. Earlier work on failure diagnosis of discrete event systems assumed permanent faults (or failures). In many systems, faulty behavior often occurs intermittently, with fault events followed by corresponding “reset” events for these faults, followed by new occurrences of fault events, and so forth. Since these events are usually unobservable, it is necessary to develop diagnostic methodologies for intermittent faults. Prior methodologies for detection and isolation of permanent faults are no longer adequate in the context of intermittent faults, since they do not account explicitly for the dynamic behavior of these faults. This paper addresses this issue by: (i) proposing a modeling methodology for discrete event systems with intermittent faults; (ii) introducing new notions of diagnosability associated with fault and reset events; and (iii) developing necessary and sufficient conditions, in terms of the system model and the set of observable events, for these notions of diagnosability. The definitions of diagnosability are complementary and capture desired objectives regarding the detection and identification of faults, resets, and the current system status (namely, is the fault present or absent). The associated necessary and sufficient conditions are based upon the technique of “diagnosers” introduced in earlier work, albeit the structure of the diagnosers needs to be enhanced to capture the dynamic nature of faults in the system model.
The diagnosability conditions are verifiable in polynomial time in the number of states of the diagnosers.

1 Introduction

Practical experience has shown that detection and isolation of many classes of faults in dynamic systems can be approached as a problem of state estimation and inferencing for discrete event systems [1–4, 6, 9–21, 25–31, 33]. The methodologies used in these applications assume that once faults occur, they remain in effect permanently; hence, the terminology “failures” is often used for these permanent faults. Similarly, to the best of our knowledge, diagnostic methodologies developed in the field of model-based reasoning in artificial intelligence (which are close in spirit to the discrete event systems methodologies, since they are also based on qualitative system models) are geared towards the diagnosis of permanent faults; see, e.g., [7, 8, 22, 23, 32].

In many systems, faulty behavior often occurs intermittently, with fault events followed by corresponding “reset” events for these faults, followed by new occurrences of fault events, and so forth. In hardware systems, intermittent faults are typically caused by bad electrical contacts (e.g., faulty relays), “sticky” components (e.g.,