AIAA-91-3736-CP

Design and Assessment of High Performance Fault-Tolerant Digital Systems

Carl R. Elks
NASA-Langley Research Center
JRPO AVRADA/AVSCOM
Hampton, VA

Steven D. Young
NASA-Langley Research Center
Systems Architecture Branch
Hampton, VA

Bob Baker
Center for Digital Systems Research
Research Triangle Institute
Raleigh, NC

"Copyright © 1991 by the American Institute of Aeronautics and Astronautics, Inc. No copyright is asserted in the United States under Title 17, U.S. Code. The U.S. Government has a royalty-free license to exercise all rights under the copyright claimed herein for Governmental purposes. All other rights are reserved by the copyright owner."

Abstract: As multiprocessor and parallel processing systems take on the responsibility for critical missions, the ability to accurately predict system performance and reliability behavior early in the design process becomes imperative to the success of the project. This is particularly true of fault-tolerant systems. Fault-tolerant system design is complex and intricate, with little room for uncertainty in the design process. Traditional design and evaluation methods have been shown to be impractical and insufficient. This results in sub-optimal designs and unacceptable development costs. Thus, there is a paramount need for the system designer to have at his disposal a plausible design paradigm and supporting toolset to assist him in evaluating the impact of design decisions in an efficient and effective manner. The underlying theme to this paradigm is "design for validation." That is, the system designer is constrained to make design decisions from the beginning that enable verification and validation of the design during the design and assessment process. This paper outlines a candidate design and assessment methodology that is targeted for the early to mid-level phases of the design cycle. This method allows alternate architectural configurations to be effectively assessed to evaluate the effects of design changes and to reduce the number of inherent design errors.

Index Terms: reliability, fault tolerance, validation.

1 Introduction

Avionics and aerospace systems for military and commercial applications have evolved over the years from centralized, single computer systems to high-throughput distributed systems. This has become necessary to meet the increasing performance demands upon current and future on-board computers to (1) process, in real-time, large amounts of information and (2) to exploit the potential vehicle performance benefits which can result from integrating control systems such as propulsion and airframe. Projected applications for these systems range from real-time adaptive control for flight performance augmentation to linear programming techniques for optimal guidance and trajectory path planning [1]. From an architectural point of view, these diverse mission requirements lead to complex non-uniform architectures that must be efficiently and reliably interconnected. In addition to these demanding performance and mission capability requirements, high dependability is required to extend operational life as system resources fail and to support critical mission phases where system operation is imperative. High dependability mandates the use of fault tolerance and fault avoidance techniques. It is expected that these techniques will become more complex as systems evolve toward distributed and parallel architectures.

In the past, system designers have relied heavily on empirical and ad-hoc methods for evaluating the performance and reliability of fault-tolerant systems. This approach has produced systems that have not always worked as originally intended, and on some occasions, sowed the seeds for serious accidents in life-