A log-linear model to estimate cheating in randomized-response

M. Cruyff 1, A. van den Hout 2, P. van der Heijden 1, and U. Böckenholt 3

1 Utrecht University, Utrecht, The Netherlands, m.cruijff@fss.uu.nl
2 Medical Research Centre, Cambridge, United Kingdom, ardo.vandenhout@mrc-bsu.cam.ac.uk
3 McGill University, Montreal, Canada, ulf.bockenholt@mcgill.ca

Summary. Randomized response (RR) is an interview technique designed to eliminate response bias when sensitive questions are asked. In RR the answer depends to a certain degree on the outcome of a randomizing device. Although RR elicits more honest answers than direct questions, the method is susceptible to cheating, in the sense that respondents do not answer in accordance with the outcome of the randomizing device. In this paper we present a log-linear randomized-response model that accounts for cheating. The main results of this model are (1) an estimate of the probability of cheating; (2) log-linear parameter estimates describing the associations between RR variables; and (3) prevalence estimates of the sensitive behavior that are corrected for cheating. We illustrate the model with two examples from a Dutch survey measuring non-compliance with social welfare rules.

Key words: randomized response, log-linear model, evasive response bias, cheating parameter

1 Introduction

Most people are reluctant to publicly answer questions about sensitive topics, such as drug or alcohol (ab)use, sexuality or anti-social behavior. As a result, respondents may refuse to give the embarrassing answer, and the stigmatizing behavior is often underreported. Randomized Response (RR) is an interview technique that was developed specifically to eliminate this kind of evasive response bias [War65, ENG92]. In RR the answer to the sensitive question is to a certain extent determined by a randomizing device, such as a pair of dice or the draw of a card.
Since the outcome of the device is known only to the respondent, confidentiality is guaranteed. A meta-analysis shows that RR yields more valid prevalence estimates than direct-questioning designs [LHH05].

Despite the protection of the respondents’ privacy, RR does not completely eliminate the response bias. Several studies have shown that the RR design is susceptible