International Journal of Computers and Applications, Vol. 31, No. 4, 2009

TIME-CONSTRAINED FAULT TOLERANT X-BY-WIRE SYSTEMS

P.M. Khilar * and S. Mahapatra **

Abstract

This paper presents a software-based approach to the problem of distributed actuator failure diagnosis under resource and deadline constraints in fly-by-wire systems, based on comparing actuator behaviour with the corresponding mathematical model. To tolerate k faults in the control and diagnosis functions, (2k + 1) replicated control and diagnosis tasks are executed. The proposed method has been evaluated through simulation based on real data and has been compared with the single-rate steer-by-wire (SBW) system recently suggested by Kandasamy et al. [1].

Key Words

Safety critical systems, fault diagnosis, distributed embedded systems, task scheduling, diagnosis latency

1. Introduction

Distributed embedded systems, which comprise processors, smart sensors, and smart actuators interacting via a communication medium, are widely used in safety-critical systems such as fly-by-wire (FBW) systems. The occurrence of transient faults (faults of short duration) at repeated time intervals in such systems leads to permanent faults. Transient faults cause the system to work in a degraded fashion but do not cause catastrophic events to system components. Permanently faulty units, on the other hand, cause catastrophic events, thus endangering the related applications, which demand a high level of fault tolerance and performance under severe cost constraints.

One possible solution is to mask failures through sufficient physical redundancy, such as the N backup microcontrollers used in the Boeing 777 [2] and MARS [3], which does not affect the execution time of the working microcontrollers. In addition, the weight, cost, space, and power consumption of the most powerful microcontrollers currently available are lower than those of older-generation processors, which makes the N-modular redundancy (NMR) approach easier to adopt [4].
However, the power consumption of the electronically signalled sensors, actuators, and processors used in an FBW system is high. An alternative and better approach is to diagnose the critical components in software and bypass faulty components during system operation. A few active standby backup microcontrollers can be enabled by software to replace the permanently faulty critical components. In this paper, comparison-based diagnosis is used to diagnose a permanently faulty actuator, to reach agreement among the fault-free units [5], and to allow the recovery tasks to enable the active standby nodes automatically to replace the faulty nodes. Comparison-based diagnosis can reduce the physical redundancy required by N-modular backup microcontrollers [6].

* Department of Computer Science & Engineering, NIT Rourkela – 769008; e-mail: khilarpm@yahoo.com
** Department of E & ECE, IIT, Kharagpur – 721 302; e-mail: sudipta@ece.iitkgp.ernet.in
Recommended by Dr. D. Wang (paper no. 202-2391)

The control surfaces and other components of an FBW system use a number of actuators and processors to perform flight control functions such as steering, tracking, and adaptive cruise control during take-off, flight, and landing of the aircraft. Embedded systems used in FBW applications for flight navigation and control employ microprocessor-controlled electromechanical actuators and networks without any mechanical backup. In these systems, processors measure the angle of attack, calculate the desired actuator deflection, and then command the electromechanical actuators at the control surfaces of the aircraft appropriately [7–9]. If an actuator delivers erroneous results due to electromechanical failure, undesirable system-level behaviour may result; e.g., a faulty actuator may cause an unwanted runway deviation during landing.
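The idea of comparing actuator behaviour against a mathematical model, with (2k + 1) replicated diagnosis tasks voting on the outcome, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the first-order actuator model, its gain, the residual threshold, and the names `model_step`, `diagnose_once`, `majority_vote`, and `diagnose_actuator` are all hypothetical and introduced here only for the example.

```python
from collections import Counter


def model_step(position, command, gain=0.5):
    """Hypothetical first-order actuator model (assumption, not from the paper):
    the actuator moves a fixed fraction of the way towards the command."""
    return position + gain * (command - position)


def diagnose_once(measured, predicted, threshold=0.1):
    """One diagnosis task's verdict: True means the actuator is judged faulty
    because the measured position deviates too far from the model prediction."""
    return abs(measured - predicted) > threshold


def majority_vote(verdicts):
    """Majority over 2k + 1 boolean verdicts; masks up to k wrong verdicts."""
    if len(verdicts) % 2 == 0:
        raise ValueError("expected an odd number (2k + 1) of verdicts")
    return Counter(verdicts).most_common(1)[0][0]


def diagnose_actuator(measured, previous, command, k=1, faulty_replicas=()):
    """Run 2k + 1 replicated diagnosis tasks and vote on the result.

    faulty_replicas lists the indices of replicas that are themselves faulty;
    for illustration, a faulty replica simply reports the inverted verdict.
    """
    predicted = model_step(previous, command)
    verdicts = []
    for replica in range(2 * k + 1):
        verdict = diagnose_once(measured, predicted)
        if replica in faulty_replicas:
            verdict = not verdict  # simulated fault in the diagnosis task
        verdicts.append(verdict)
    return majority_vote(verdicts)
```

With k = 1, three replicas run and a single faulty diagnosis task is outvoted; with two faulty replicas the vote is no longer trustworthy, which is why 2k + 1 replicas are needed to tolerate k faults.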
Timely diagnosis of faulty actuators under deadline and resource constraints can ensure a fault-tolerant FBW system.

Previous approaches to distributed diagnosis mainly treat diagnosis as a standalone objective, without considering the normal system functions and the real-time behaviour of control applications [5, 10–21]. Most of these algorithms work offline and are not suitable for online embedded control applications.

Recently, Kandasamy et al. [1] proposed distributed failure diagnosis of actuators used in steer-by-wire (SBW) systems. Their work assumed a single-rate (SR) system in which control functions such as steering, traction, and cruise have the same period. However, their approach suffers from a number of limitations, such as (i) the control and diagnosis tasks do not have any jitter left between them and therefore fail to accommodate