Safety and Reliability – Bedford & van Gelder (eds)
© 2003 Swets & Zeitlinger, Lisse, ISBN 90 5809 551 7

A non-parametric two-stage Bayesian model using Dirichlet distribution

C. Bunea, R.M. Cooke & T.A. Mazzuchi
Delft University of Technology, Delft, Netherlands

G. Becker
RISA, Berlin, Germany

ABSTRACT: As an alternative to standard two-stage Bayesian models, a non-parametric or Dirichlet two-stage model is presented. Its main advantages over the classical models are its analytic solution and its clear interpretation. A number of case studies are simulated in order to check the robustness of the model. Three data sets from the German ZEDB project are also used to compare the results of the Dirichlet model with the results of standard one-stage and two-stage Bayesian models.

1 INTRODUCTION

Within the context of a recent review of a two-stage Bayesian model for processing data from a population of German nuclear plants, a non-parametric or Dirichlet two-stage model was developed. This model has some advantages relative to the standard two-stage models: it is analytically solvable, so no numerical integration needs to be performed, and it allows an intuitive interpretation of the (hyper)parameters as so-called "equivalent observations". We check the robustness of this model with a simple numerical example. Preliminary calculations show some sensitivity of the model to the number of cells that characterize the prior distribution and to their end points. For the purposes of comparison with classical Bayesian models [Vaurio, Hofer, Becker], the results for three data sets are presented. Considering its apparent advantages, the authors recommend that the Dirichlet model be developed further, to qualify it for practical use in database analysis.

2 BAYESIAN TWO STAGE HIERARCHICAL MODELS

A two-stage model is really nothing more than a joint distribution [Cooke et al 2002]. To be useful, however, we must derive conditional distributions.
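The step from a joint distribution to a conditional one can be sketched numerically on a toy discrete model. Everything in the sketch below is an illustrative assumption rather than part of the models discussed in this paper: the grid of candidate rates, the two hypothetical hyperparameter values "low" and "high", their weights, and the function names. The sketch enumerates the joint distribution over a hyperparameter, two plant rates and two failure counts, then conditions on the counts and marginalizes to the rate of the first plant.

```python
import math
from itertools import product


def poisson_pmf(x, rate, t):
    """P(X = x) for a Poisson count with intensity `rate` over exposure time `t`."""
    mu = rate * t
    return math.exp(-mu) * mu ** x / math.factorial(x)


# Hypothetical toy model: each hyperparameter value q selects one
# discrete prior over the same small grid of candidate failure rates.
rates = [0.1, 0.5, 1.0]
prior_given_q = {
    "low": [0.7, 0.2, 0.1],
    "high": [0.1, 0.2, 0.7],
}
hyperprior = {"low": 0.5, "high": 0.5}


def posterior_rate0(x0, t0, x1, t1):
    """Posterior over `rates` for plant 0, given counts at plants 0 and 1.

    Sums the joint P(q, rate0, rate1, x0, x1) over q and rate1 for each
    candidate rate0, then normalizes (conditioning on the observed counts).
    """
    weights = [0.0] * len(rates)
    for q, (i, r0), (j, r1) in product(hyperprior, enumerate(rates), enumerate(rates)):
        joint = (
            hyperprior[q]
            * prior_given_q[q][i] * poisson_pmf(x0, r0, t0)
            * prior_given_q[q][j] * poisson_pmf(x1, r1, t1)
        )
        weights[i] += joint
    total = sum(weights)
    return [w / total for w in weights]


# Same sparse data at plant 0, different experience at plant 1.
post_many = posterior_rate0(x0=0, t0=1.0, x1=10, t1=10.0)
post_none = posterior_rate0(x0=0, t0=1.0, x1=0, t1=10.0)
```

In this toy setting, the data from the second plant reach the first plant's rate only through the shared hyperparameter, so `post_many` puts more weight on the highest candidate rate than `post_none` does.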
Typically we want to use data from "other plants" to make predictions about a given plant. This is very attractive in cases where the data from the given plant are sparse. By specifying the model assumptions one can derive the posterior distribution $P(\lambda_0 \mid X_0, \ldots, X_n)$ for the failure rate $\lambda_0$ at the plant of interest, plant 0, given $X_i$ failures in observation time $T_i$ at plant $i$, $i = 0, 1, \ldots, n$.

First, we identify the conditional independence assumptions needed to factor the joint distribution. The conditional independence assumptions met in the literature, with one possible exception [Hofer et al 1997][Hofer 1999], are stated below:

CI.1 Given $Q$, $\lambda_i$ is independent of $\{X_j, \lambda_j\}_{j \neq i}$
CI.2 Given $\lambda_i$, $X_i$ is independent of $\{Q, \lambda_j, X_j\}_{j \neq i}$,

where $Q$ is the hyperparameter of the prior distribution from which the Poisson intensities $\lambda_1, \ldots, \lambda_n$ are drawn. The expression "$X_i$ is independent of $\{Q, \lambda_j, X_j\}_{j \neq i}$" entails that $X_i$ is independent of $Q$, and that $X_i$ is independent of $\lambda_j$ for $j \neq i$.

Given the conditional independence assumptions, [Cooke et al 2002] derived the explicit form of the posterior distribution $P(\lambda_0 \mid X_0, \ldots, X_n)$ for the failure rate $\lambda_0$:

$$P(\lambda_0 \mid X_0, \ldots, X_n) \propto \int_Q P(X_0 \mid \lambda_0)\, P(\lambda_0 \mid Q) \left[ \prod_{i=1}^{n} \int_{\lambda_i} P(X_i \mid \lambda_i)\, P(\lambda_i \mid Q)\, d\lambda_i \right] P(Q)\, dQ \qquad (1)$$

Assumptions must also be made regarding the fixed distribution types and the hyperprior over $Q$. In the two-stage Bayesian models considered here, the likelihood of the failure times from each plant $i$, $P(X_i, T_i \mid \lambda_i)$, given $\lambda_i$ and given any information from other