Metrika manuscript No. (will be inserted by the editor)

Sample size determination of binomial data with the presence of misclassification

Athanassios Katsis

Department of Social & Educational Policy, University of Peloponnese, 68 Dikearhou Street, 11636 Athens, Greece
email: katsis@uop.gr

Received: date / Revised version: date

Abstract We propose a double sampling scheme with two classifiers to address the problem of optimal sample size when binomial observations are subject to misclassification. The classifiers differ in classification cost and precision. Furthermore, since the data are unknown at the design stage, an additional constraint is set on the probability of observing “undesirable” data. The method is developed from the Bayesian point of view.

Key words: Sample size; Misclassification; Bayesian design

AMS Subject Classification: Primary 62K05, Secondary 62F15

1 Introduction

The binomial parameter $p$ is usually estimated by $\hat{p}$ unless the problem of misclassification arises (see, for example, Bross (1954)). In this case, an understanding of the level of misclassification is desired. Therefore a double sampling technique is used, in which an infallible and a fallible device classify people or objects into two categories (‘0’ or ‘1’). The fallible device is error-prone, whereas the infallible device yields error-free results but costs considerably more. Ideally, an estimate of $p$ could be derived by using the infallible device exclusively. However, cost constraints may make this impossible. On the other hand, classifying only by the fallible device biases the estimate of $p$. As a compromise, the following double sampling scheme is proposed:

1. In the first stage, a sample of $n_1$ units is classified by both devices, while requiring that the posterior variance of the misclassification probabilities not exceed a pre-specified level.