C. Fyfe et al. (Eds.): IDEAL 2008, LNCS 5326, pp. 250–257, 2008. © Springer-Verlag Berlin Heidelberg 2008

A Data Perturbation Method by Field Rotation and Binning by Averages Strategy for Privacy Preservation

Mohammad Ali Kadampur and Somayajulu D.V.L.N.

Department of Computer Science and Engineering
National Institute of Technology Warangal-506004, A.P. India
ali.kadmpur@gmail, soma@nitw.ac.in
www.nitw.ac.in

Abstract. This paper presents a novel technique for preserving the privacy of sensitive data, with a specific focus on numeric databases. Analysts and decision makers are typically interested in summary values of the data rather than the actual values. The proposed method assumes that most of the information lies in the associations among attributes rather than in their actual values. It therefore perturbs attribute associations in a controlled way by rotating fields, that is, by shifting the data values of specific columns. The number of rotations is determined using a support function for association rule handling and an algorithm that computes the best-choice rotation dynamically. Summary statistics of the numeric data, such as the average and standard deviation, are preserved by replacing the actual values with bin averages. The methods are tested on selected datasets and the results are reported.

1 Introduction

Privacy is defined as “freedom from unauthorized intrusion” [15]. It serves as a deterrent against exposing individually identifiable data in the process of knowledge extraction. Data mining technology is used for extracting knowledge from vast quantities of data. However, the use of this technology has raised the concern that individual privacy may be violated. A data mining technique must therefore ensure that any information disclosed

1. cannot be traced to an individual; or
2. does not constitute an intrusion.

There are multiple approaches to achieve these goals [15].
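As a rough illustration of the two operations named in the abstract, the Python sketch below rotates one numeric column and then replaces its values with bin averages. It is a minimal sketch under stated assumptions, not the authors' algorithm: "field rotation" is read here as a cyclic shift of a column, the rotation count `k` is a fixed hypothetical parameter (the paper derives it from a support function), and the bins are assumed to be equal-frequency bins over the sorted values. The names `rotate_column`, `bin_by_averages`, and the sample `salaries` data are all hypothetical.

```python
def rotate_column(values, k):
    """Cyclically shift a column down by k positions.

    Assumption: this is one plausible reading of "field rotation".
    The values of the column are kept, but each value moves to a
    different row, so its association with other attributes changes.
    """
    n = len(values)
    if n == 0:
        return []
    k %= n
    if k == 0:
        return list(values)
    return list(values[-k:]) + list(values[:-k])


def bin_by_averages(values, bin_size):
    """Replace every value with the average of its bin.

    Assumption: bins are formed by sorting the values and cutting the
    sorted order into consecutive groups of `bin_size` items (the last
    bin may be smaller). Each value is replaced in its original row.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    for start in range(0, len(order), bin_size):
        idx = order[start:start + bin_size]
        avg = sum(values[i] for i in idx) / len(idx)
        for i in idx:
            out[i] = avg
    return out


# Hypothetical sample column.
salaries = [30, 50, 40, 80, 60, 70]

rotated = rotate_column(salaries, 2)   # k = 2 is an arbitrary choice here
binned = bin_by_averages(rotated, 3)

print(rotated)
print(binned)
print(sum(salaries) / len(salaries), sum(binned) / len(binned))
```

In this sketch the rotation leaves the set of values in the column unchanged, and the bin-average replacement leaves the column sum, and hence the overall average, unchanged, because each bin contributes the same total before and after replacement.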
Data perturbation is one of the methods for preserving privacy [2][12][15]. In perturbed databases, if data is accessed without authorization, the true value is not disclosed. Data perturbation techniques in effect distort the data in different ways before presenting it to the data mining algorithm, so that individually identifiable (private) values are not revealed. The privacy-preserving properties of such databases are a result of the perturbation. In this paper a novel composite method for data perturbation is proposed.

2 Related Work

In order to distort the data and preserve individual privacy, researchers have employed methods such as data encryption [11][13], data randomization [12][15], data swapping