Metrika manuscript No. (will be inserted by the editor)

Sample size determination of binomial data with the presence of misclassification

Athanassios Katsis

Department of Social & Educational Policy, University of Peloponnese, 68 Dikearhou Street, 11636 Athens, Greece
email: katsis@uop.gr

Received: date / Revised version: date

Abstract We propose a double sampling scheme with two classifiers to address the problem of optimal sample size when misclassification among binomial observations is observed. The classifiers vary with respect to the classifying cost and precision. Furthermore, since the data are unknown, an additional constraint is set on the probability of observing “undesirable” data. The method is developed following the Bayesian point of view.

Key words: Sample size; Misclassification; Bayesian design

AMS Subject Classification: Primary 62K05, Secondary 62F15

1 Introduction

The binomial parameter p is usually estimated by $\hat{p}$ unless the problem of misclassification arises (see, for example, Bross (1954)). In this case, an understanding of the level of misclassification is desired. Therefore a double sampling technique, in which infallible and fallible devices are used to classify people or objects into two categories (‘0’ or ‘1’), is utilized. The fallible device is error-prone, whereas the infallible device yields error-free results but costs considerably more. Ideally, an estimate of p could be derived by using the infallible device exclusively. However, due to cost constraints this may not be possible. On the other hand, classifying only by the fallible device ensures the existence of bias in the estimate of p. As a compromise the following double sampling scheme is proposed:

1. In the first stage, a sample of $n_1$ units will be classified by both devices while requiring the posterior variance of the misclassification probabilities not to exceed a pre-specified level.