Journal of Mathematical Sciences, Vol. 123, No. 1, 2004

ON RELIABILITY OF HIERARCHICAL SYSTEMS WITH GRADUAL FAILURES*

B. Dimitrov (Flint, USA) and V. Rykov (Moscow, Russia)

UDC 519.2

1. Introduction and Motivation

In complex systems with controllable reliability, a complete system failure usually does not occur suddenly but results from the accumulation of many gradual failures. This has stimulated the study of systems with gradual failures of different types, or multi-state reliability systems (for an extensive recent bibliography, see [1]).

Two main characteristics are common in reliability studies: the lifetime of the system and its steady-state characteristics under some assumptions about the repair process. The way these characteristics are evaluated depends on which of two aspects is emphasized: probabilistic or structural. The probabilistic aspect deals with calculating the probabilities of system states and using them in reliability calculations. The structural aspect concerns direct evaluation of reliability characteristics for a given structure of a particular system.

In this paper, we propose a general approach to describing a model and evaluating the most common reliability characteristics of complex hierarchical systems with various types of gradual failures. Such failures may change the state of the system and the quality of its operation but do not necessarily lead to complete system failure. A special set of “failure states” of the components of the system causes its complete failure. The reliability of the system is partially controllable. Various repair policies are possible after a gradual failure of some part of the system is detected: the whole system, only the failed element, or some structural part of the system can be repaired. The case of repair of the whole system was considered in [3]. The case of repair of only a simple unit was proposed in [4].
In this paper, we deal with the probabilistic aspects of modeling system reliability under both the whole-system repair policy and the policy of repairing only a simple unit, and we focus on both common characteristics: the lifetime of the system and its steady-state behavior. In the next section, a general model describing the reliability process in systems with gradual failures is proposed. An explicit study of a simple unit model is given in Sec. 3. Its extension to the general model under the whole-system repair policy is shown in Sec. 4. The general model with the unit-only repair policy is considered in Sec. 5. In Sec. 6, algorithms for calculating the steady-state probabilities and the Laplace transforms of the time-dependent probabilities are given, and some examples are considered in Sec. 7.

2. A General Model

Consider a complex, hierarchical, multi-component system subjected to gradual internal failures of different types. Assume that the system is constructed from blocks and branches of several levels (see Fig. 1). Each block, together with the line of branches and blocks that follow it, forms a hierarchical subsystem of the same type as the main system. The blocks of the last (lowest) level are referred to as units, and each may be subjected to gradual failures of its own type. Some special combination F of unit failures causes the failure of the whole system. We denote by L the lowest level of units; not all units need to belong to this level, and units may be allocated at different levels.

The reliability of the system is partially controllable. Various control and repair policies are available. Under the common control policy there is only one control unit, while under the separate control policy each unit is controlled separately. Three different types of repair policies are possible when a failure is detected:

• Whole-system renovation;
• Renovation of only the failed unit;
• Structural renovation: renovation of some subsystem depending on the state of the system and the location of the failed unit.
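As an illustration of the model described in this section, the following minimal Python sketch represents the hierarchy as a tree whose leaves are units and computes which units are renewed under each of the three repair policies. All names here (`Block`, `find_path`, `renewal_scope`, the policy strings, and the `level` parameter) are hypothetical and do not come from the paper. In particular, the structural policy is reduced to "renew the ancestor block at a fixed depth", which is only one simple assumption; the paper lets the renewed subsystem depend on the system state as well as on the location of the failed unit.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Set


@dataclass
class Block:
    """A node of the hierarchy; a block without children is a unit."""
    name: str
    children: List["Block"] = field(default_factory=list)

    def is_unit(self) -> bool:
        return not self.children

    def units(self) -> Iterator["Block"]:
        """Yield every unit (leaf) of the subsystem rooted at this block."""
        if self.is_unit():
            yield self
        else:
            for child in self.children:
                yield from child.units()


def find_path(root: Block, unit_name: str) -> Optional[List[Block]]:
    """Return the chain of blocks from the root down to the named unit.

    Unit names are assumed to be unique within the system.
    """
    if root.is_unit():
        return [root] if root.name == unit_name else None
    for child in root.children:
        path = find_path(child, unit_name)
        if path is not None:
            return [root] + path
    return None


def renewal_scope(root: Block, failed_unit: str, policy: str,
                  level: int = 1) -> Set[str]:
    """Names of the units renewed after a failure of `failed_unit`."""
    path = find_path(root, failed_unit)
    if path is None:
        raise KeyError(f"unknown unit: {failed_unit}")
    if policy == "whole_system":
        target = root
    elif policy == "unit":
        target = path[-1]
    elif policy == "structural":
        # Assumed rule: renew the ancestor at depth `level` (root is depth 0),
        # or the unit itself if it sits above that depth.
        target = path[min(level, len(path) - 1)]
    else:
        raise ValueError(f"unknown policy: {policy}")
    return {u.name for u in target.units()}


# A small system with units at different levels: u0 hangs directly off the root.
system = Block("S", [
    Block("A", [Block("a1"), Block("a2")]),
    Block("B", [Block("b1")]),
    Block("u0"),
])

print(renewal_scope(system, "a1", "whole_system"))
print(renewal_scope(system, "a1", "unit"))
print(renewal_scope(system, "a1", "structural"))
```

In this example a failure of `a1` renews all four units under whole-system renovation, only `a1` under unit renovation, and the units of block `A` under the assumed structural rule.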
In these cases, respectively, the whole system, only the failed unit, or some block is renewed. In this paper, we consider only the common control policy with either whole-system renovation or renovation of only the failed unit.

*Partially supported by the RFBR (grant No. 01-07-90259).

Proceedings of the International Seminar on Applied Stochastic Models and Information Processes, Petrozavodsk, Russia, 2002.

1072-3374/04/1231-3802 © 2004 Plenum Publishing Corporation