A flexible Bayesian algorithm for sample size calculations in misclassified data

Hector E. Nistazakis a, Athanassios Katsis b,*

a Department of Telecommunications Science and Technology, University of Peloponnese, Karaiskaki Street, Tripolis 22100, Greece
b Department of Social and Educational Policy, University of Peloponnese, Damaskinou and Kolokotroni Street, Korinthos 20100, Greece

* Corresponding author.
E-mail addresses: enistaz@uop.gr (H.E. Nistazakis), katsis@uop.gr (A. Katsis).

Applied Mathematics and Computation 184 (2007) 86–92
www.elsevier.com/locate/amc
0096-3003/$ - see front matter. doi:10.1016/j.amc.2005.12.071

Abstract

Practitioners need a flexible, easy-to-implement algorithm for deriving the optimal sample size when the data are subject to misclassification. The topic is addressed from the Bayesian point of view, where a special structure of the a priori parameter information is investigated. The proposed methodology is applied to specific examples.

© 2006 Elsevier Inc. All rights reserved.

Keywords: Sample size; Misclassification; Bayesian point of view; Average coverage

1. Introduction

The specification of the optimal sample size is one of the most immediate concerns of any applied researcher. Classical statistical theory provides well-known formulae for specific cases. However, the existing theory is not particularly helpful in more complex sampling situations, such as when misclassification is present. Misclassification is a common applied problem in binomial data, where the experiment has only two possible outcomes, and it is highly unlikely that the data will always be classified accurately according to the true state of nature. For instance, in medicine the infallible classification of a potential patient regarding a certain disease is of paramount importance. However, despite recent advances in diagnostic procedures, misclassification occurs quite often. Other areas of scientific research with similar problems are election polling, where not all respondents report their true voting record, and quality control, where certain characteristics of the sampling units are usually recorded by an imperfect device such as human inspection. Finally, misclassification is a frequent problem in insurance and auditing, where complicated legislation and multiple sources of payment (retirement benefits, sickness or unemployment compensations, outstanding claims, etc.) may