L-Cover: Preserving Diversity by Anonymity

Lei Zhang¹, Lingyu Wang², Sushil Jajodia¹, and Alexander Brodsky¹

¹ Center for Secure Information Systems, George Mason University, Fairfax, VA 22030, USA
{lzhang8,jajodia,brodsky}@gmu.edu
² Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC H3G 1M8, Canada
wang@ciise.concordia.ca

Abstract. To release micro-data tables containing sensitive data, generalization algorithms are usually required for satisfying given privacy properties, such as k-anonymity and l-diversity. It is well accepted that k-anonymity and l-diversity are proposed for different purposes, and the latter is a stronger property than the former. However, this paper uncovers an interesting relationship between these two properties when the generalization algorithms are publicly known. That is, preserving l-diversity in micro-data generalization can be done by preserving a new property, namely, l-cover, which is to satisfy l-anonymity in a special way. The practical impact of this discovery is that it may potentially lead to better heuristic generalization algorithms in terms of efficiency and data utility, that remain safe even when publicized.

1 Introduction

The micro-data release problem has attracted much attention due to increasing concerns over personal privacy. Various generalization techniques have been proposed to transform a micro-data table containing sensitive information for satisfying given privacy properties, such as k-anonymity [16] and l-diversity [2]. For example, in the micro-data table shown in Table 1, suppose each patient's medical condition is to be kept confidential. The attributes can thus be classified into three classes, namely, identity (Name), quasi-identifiers (ZIP, Age), and sensitive value (Condition). Clearly, simply hiding the identity (Name) when releasing the table is not sufficient.
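To make the attribute classification concrete, the following minimal Python sketch counts how many tuples share each (ZIP, Age) combination once Name is ignored. The rows are hypothetical and are not the paper's Table 1; the function names `quasi_identifier_counts`, `anonymity_level`, and `uniquely_linkable` are likewise assumptions introduced only for illustration.

```python
from collections import Counter

# Hypothetical rows (Name, ZIP, Age, Condition); NOT the paper's Table 1.
ROWS = [
    ("Alice", "22030", 31, "flu"),
    ("Bob", "22030", 31, "cancer"),
    ("Carol", "22031", 45, "flu"),
    ("Dave", "22032", 52, "asthma"),
]


def quasi_identifier_counts(rows):
    """Count the tuples sharing each (ZIP, Age) combination, ignoring Name."""
    return Counter((zip_code, age) for _name, zip_code, age, _cond in rows)


def anonymity_level(rows):
    """Return the largest k such that every (ZIP, Age) group has >= k tuples."""
    counts = quasi_identifier_counts(rows)
    return min(counts.values()) if counts else 0


def uniquely_linkable(rows):
    """Return rows whose (ZIP, Age) combination is unique in the table."""
    counts = quasi_identifier_counts(rows)
    return [row for row in rows if counts[(row[1], row[2])] == 1]
```

Under these assumed rows, `anonymity_level(ROWS)` is 1, and `uniquely_linkable(ROWS)` returns the two rows whose quasi-identifier combination occurs only once, so dropping the Name column alone would not protect them.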
A tuple and its sensitive value may still be linked to a unique identity through the quasi-identifiers, if the combination (ZIP, Age) happens to be unique [16]. To prevent such a linking attack, the table needs to be generalized to satisfy k-anonymity. For example, if generalization (A) in Table 2 is released, then any linking attack can at best link an identity to a group of two tuples with the same combination (ZIP, Age). We can also see from the example that