Journal of Official Statistics, Vol. 13, No. 1, 1997, 75–89

A Bayesian Approach to Data Disclosure: Optimal Intruder Behavior for Continuous Data

Stephen E. Fienberg,¹ Udi E. Makov,² and Ashish P. Sanil³

In this article we develop an approach to data disclosure in survey settings by adopting a probabilistic definition of disclosure due to Dalenius. Our approach is based on the principle that a data collection agency must consider disclosure from the perspective of an intruder in order to efficiently evaluate data disclosure limitation procedures. The probabilistic definition and our attempt to study optimal intruder behavior lead naturally to a Bayesian formulation. We apply the methods in a small-scale simulation study using data adapted from an actual survey conducted by the Institute for Social Research at York University.

Key words: Confidentiality; disclosure limitation; inferential disclosure; identity disclosure; measurement error.

¹ Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213, U.S.A.
² Department of Statistics, Haifa University, Haifa, 31905, Israel.
³ Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213, U.S.A.

Acknowledgments: The preparation of this article was supported in part by a grant from the Natural Sciences and Engineering Research Council of Canada to the first author at York University, North York, Ontario, Canada, and in part by the U.S. Bureau of the Census. We are especially grateful to the Institute for Social Research at York University, which provided us with the data used in Section 4, and to Sangeeta Agrawal for assistance with the structuring of the database and a variety of initial calculations. We received valuable comments and suggestions on earlier drafts of the manuscript from staff at the U.S. Bureau of the Census, an Associate Editor, and two referees, and these have led to substantial changes and improvements to the article. A preliminary version of this article was presented at the Second International Seminar on Statistical Confidentiality, Luxembourg, November 1994, and has been published as Fienberg, Makov, and Sanil (1995) under the same title.

© Statistics Sweden

1. Introduction

There has been a longstanding government interest and concern in the United States and elsewhere over the confidentiality of statistical data, especially as gathered in sample surveys and censuses. For example, the U.S. Bureau of the Census has operated under Title 13 of the U.S. Code virtually since its inception in 1929. Such legal guarantees of confidentiality are not only a reflection of the public concerns regarding disclosure but also of the agencies' desire for high quality data. Even in the absence of legal restrictions on access to data, statistical agencies and survey researchers have always been concerned about the need to preserve the confidentiality of respondents in order to ensure the quality of the data provided, and these concerns have been heightened by the decline in response rates for censuses and surveys over the past two decades (e.g., see Panel on Privacy and Confidentiality as Factors in Survey Response 1979; Fienberg 1993–1994). At the same time government agencies have an obligation to report their data widely and thus they recognize the need for some balance between strict confidentiality (however it is to be interpreted) and the benefits derived from the release of statistical information. To