Safety and Reliability – Bedford & van Gelder (eds)
© 2003 Swets & Zeitlinger, Lisse, ISBN 90 5809 551 7
A non-parametric two-stage Bayesian model using Dirichlet distribution
C. Bunea, R.M. Cooke & T.A. Mazzuchi
Delft University of Technology, Delft, Netherlands
G. Becker
RISA, Berlin, Germany
ABSTRACT: As an alternative to standard two-stage Bayesian models, a non-parametric or Dirichlet
two-stage model is presented. Its main advantages over the classical models are its analytic solution
and its clear interpretation. A number of case studies are simulated to check the robustness of the model.
Three data sets from the German ZEDB project are also used to compare the results of the Dirichlet model
with those of standard one-stage and two-stage Bayesian models.
1 INTRODUCTION
Within the context of a recent review of a two-stage
Bayesian model for processing data from a population of
German nuclear plants, a non-parametric or Dirichlet
two-stage model was developed. This model has some
advantages relative to the standard two-stage models:
it is analytically solvable, so no numerical integration
needs to be performed, and it allows an intuitive
interpretation of the (hyper)parameters as so-called
“equivalent observations”. We check the robustness of this
model with a simple numerical example. Preliminary
calculations show some sensitivity of the model to
the number of cells that characterize the prior
distribution and to their end points. For comparison
with the classical Bayesian models
[Vaurio, Hofer, Becker], the results for three data sets
are presented. Considering its apparent advantages,
the authors recommend that the Dirichlet model be
developed further to qualify it for practical use in
database analysis.
2 BAYESIAN TWO-STAGE HIERARCHICAL MODELS
A two-stage model is really nothing more than a joint
distribution [Cooke et al 2002]. To be useful, however,
we must derive conditional distributions. Typically
we want to use data from “other plants” to make predictions
about a given plant. This is very attractive in
cases where the data from the given plant is sparse.
By specifying the model assumptions one can
derive the posterior distribution $P(\lambda_0 \mid X_0, \ldots, X_n)$ for
the failure rate $\lambda_0$ at the plant of interest 0, given $X_i$ failures
and $T_i$ observation times at plant $i$, $i = 0, 1, \ldots, n$. First,
we can identify the conditional independence
assumptions in order to factor the joint distribution.
The conditional independence assumptions met in the
literature, with one possible exception [Hofer et al
1997][Hofer 1999], are stated below:
CI.1 Given $\Theta$, $\lambda_i$ is independent of $\{X_j, \lambda_j\}_{j \neq i}$
CI.2 Given $\lambda_i$, $X_i$ is independent of $\{\Theta, \lambda_j, X_j\}_{j \neq i}$,
where $\Theta$ is the hyperparameter of the prior distribution
from which the Poisson intensities $\lambda_1, \ldots, \lambda_n$ are
drawn.
The expression “$X_i$ is independent of $\{\Theta, \lambda_j, X_j\}_{j \neq i}$”
entails that $X_i$ is independent of $\Theta$, and that $X_i$ is
independent of $\lambda_j$.
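Read together, CI.1 and CI.2 imply a product form for the joint distribution. The following is a sketch of that factorization, writing $\lambda_i$ for the failure rate at plant $i$ and $\Theta$ for the hyperparameter. It assumes that both conditions hold for every plant $i = 0, \ldots, n$, and the symbol $P(\Theta)$ for the hyperprior is notation introduced here for illustration.

```latex
\begin{equation*}
P(\Theta, \lambda_0, \ldots, \lambda_n, X_0, \ldots, X_n)
  = P(\Theta) \prod_{i=0}^{n} P(\lambda_i \mid \Theta)\, P(X_i \mid \lambda_i)
\end{equation*}
```

CI.1 makes the rates mutually independent given $\Theta$, which yields the factors $P(\lambda_i \mid \Theta)$. CI.2 makes each count depend on the rest of the model only through its own rate, which yields the factors $P(X_i \mid \lambda_i)$.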
Given the conditional independence assumptions,
[Cooke et al 2002] derived the explicit form of the
posterior distribution $P(\lambda_0 \mid X_0, \ldots, X_n)$ for the failure rate $\lambda_0$:
(1)
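As an illustration of how a posterior of this kind can be evaluated, the sketch below integrates out the hyperparameter and the rates of the other plants. It is not the model of [Cooke et al 2002]: it assumes a Poisson likelihood for the failure counts, a gamma prior for each rate given a hyperparameter pair (alpha, beta), and a uniform hyperprior on a small grid of such pairs. Under these assumptions the inner integrals are negative binomial probabilities and the posterior is a mixture of gamma distributions. All function names and data values are hypothetical.

```python
import math


def log_marginal(x, t, alpha, beta):
    """Log of the integral of Poisson(x | lam * t) * Gamma(lam | alpha, beta) over lam.

    With a gamma prior (shape alpha, rate beta) this is a negative binomial
    probability. Requires t > 0, alpha > 0, beta > 0.
    """
    return (math.lgamma(alpha + x) - math.lgamma(alpha) - math.lgamma(x + 1)
            + alpha * math.log(beta / (beta + t))
            + x * math.log(t / (beta + t)))


def two_stage_posterior(failures, times, hyper_grid):
    """Posterior of the rate at plant 0 as a mixture of gamma distributions.

    failures[0] and times[0] belong to the plant of interest; the remaining
    entries are the other plants. hyper_grid is a list of (alpha, beta)
    pairs carrying a uniform hyperprior (an assumption of this sketch).
    Returns (weights, components), where components[k] is the (shape, rate)
    of the gamma posterior for the plant-0 rate given hyper_grid[k].
    """
    # Log posterior weight of each grid point: sum of the log marginal
    # likelihoods over all plants (the uniform hyperprior adds a constant).
    log_w = [sum(log_marginal(x, t, a, b) for x, t in zip(failures, times))
             for a, b in hyper_grid]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    weights = [v / total for v in w]
    # Conjugate gamma update for the plant of interest at each grid point.
    components = [(a + failures[0], b + times[0]) for a, b in hyper_grid]
    return weights, components


def posterior_mean(weights, components):
    """Mean of the gamma mixture: weighted average of shape / rate."""
    return sum(w * shape / rate for w, (shape, rate) in zip(weights, components))


# Hypothetical data: failure counts and observation times (hours) for four plants.
failures = [0, 2, 1, 4]
times = [1000.0, 5000.0, 3000.0, 8000.0]
hyper_grid = [(a, b) for a in (0.5, 1.0, 2.0) for b in (500.0, 1000.0, 2000.0)]

weights, components = two_stage_posterior(failures, times, hyper_grid)
mean_rate = posterior_mean(weights, components)
```

The grid stands in for the integral over the hyperparameter; a finer grid or a different hyperprior changes the weights but not the structure of the calculation.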
Assumptions must also be made regarding the
fixed distribution types and the hyperprior $\Theta$. In the
two-stage Bayesian models considered here, the
likelihood of the failure times from each plant $i$,
$P(X_i, T_i \mid \lambda_i)$, given $\lambda_i$ and given any information from other