IJSRSET1732191 | 09 April 2017 | Accepted: 19 April 2017 | March-April-2017 [(2)2: 648-660]
© 2017 IJSRSET | Volume 3 | Issue 2 | Print ISSN: 2395-1990 | Online ISSN: 2394-4099
Themed Section: Engineering and Technology

Information Security and Data Mining in Big Data

Tejas P. Adhau*1, Prof. Dr. Mahendra A. Pund*2
Department of Computer Science of Engineering/SGBAU University/PRMIT Badnera/Amravati, Maharashtra, India

ABSTRACT

The growing popularity and development of data mining technologies pose a serious threat to the security of individuals' sensitive information. An emerging research topic in data mining, known as privacy-preserving data mining (PPDM), has been extensively studied in recent years. The basic idea of PPDM is to modify the data in such a way that data mining algorithms can still be performed effectively without compromising the security of the sensitive information contained in the data. Current studies of PPDM mainly focus on how to reduce the privacy risk brought by data mining operations, while in fact unwanted disclosure of sensitive information may also happen during data collection, data publishing, and delivery of information (i.e., the data mining results). In this paper, we view the privacy issues related to data mining from a wider perspective and investigate various approaches that can help to protect sensitive information. In particular, we identify four different types of users involved in data mining applications, namely, data provider, data collector, data miner, and decision maker. For each type of user, we focus on that user's privacy concerns and on how to protect sensitive information.

Keywords: Data Mining, Sensitive Information, Privacy-Preserving Data Mining, Provenance, Anonymization, Privacy Auction, Antitracking.

I. INTRODUCTION

Data mining has attracted more and more attention in recent years, probably because of the popularity of the "big data" concept.
Data mining is the process of examining large pre-existing databases in order to generate new information; the results help guide future activities. Data mining is also used to analyze data for relationships that have not previously been discovered. The term data warehouse refers to a database that is used for analysis. Users of a warehouse should be able to specify what type of data they want to view and at what level of detail they want to view the relationships among data items.

II. METHODS AND MATERIAL

1. The Process of KDD

Three of the major data mining techniques are regression, classification, and clustering. Data mining is also popularly known as Knowledge Discovery in Databases (KDD) [1] [2]. KDD is a widely used process that includes data preparation, data selection, and the generation of result patterns. Some issues involved in the entire KDD process are:

- Identify the goal of the KDD process.
- Understand the application domain involved and the knowledge that is required.
- Select the data set on which discovery is to be performed.
- Alter the data as per the requirements.
- Simplify the data sets by removing unwanted variables and missing fields.
- Match KDD goals with data mining methods to suggest hidden patterns.
- Choose data mining algorithms to discover hidden patterns.
- Search for patterns of interest in a particular representational form, such as classification rules or trees, regression, and clustering.
- Interpret essential knowledge from the mined patterns.
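The KDD steps listed above can be illustrated with a minimal Python sketch. It is not taken from the paper: the toy records, the field names (`age`, `income`, `buys`), and the helper functions `select`, `clean`, `mine_threshold_rule`, and `kdd` are all hypothetical, and the "mining" step is reduced to searching for a single threshold-based classification rule. It only shows how selection, simplification, pattern search, and interpretation might fit together.

```python
# Hypothetical toy records; field names and values are invented for illustration.
RAW = [
    {"id": 1, "age": 25, "income": 30, "buys": "no"},
    {"id": 2, "age": 32, "income": 45, "buys": "no"},
    {"id": 3, "age": 47, "income": None, "buys": "yes"},  # missing field
    {"id": 4, "age": 51, "income": 80, "buys": "yes"},
    {"id": 5, "age": 38, "income": 62, "buys": "yes"},
    {"id": 6, "age": 29, "income": 38, "buys": "no"},
]


def select(records, fields):
    """Selection: keep only the variables relevant to the stated goal."""
    return [{f: r[f] for f in fields} for r in records]


def clean(records):
    """Simplification: drop records that contain missing fields."""
    return [r for r in records if all(v is not None for v in r.values())]


def mine_threshold_rule(records, feature, label, positive):
    """Pattern search: find the threshold t for which the rule
    'feature >= t implies positive label' classifies the most records correctly.
    Returns (threshold, accuracy on the given records)."""
    best_t, best_correct = None, -1
    for t in sorted({r[feature] for r in records}):
        correct = sum((r[feature] >= t) == (r[label] == positive) for r in records)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t, best_correct / len(records)


def kdd(records):
    """Run the simplified pipeline and return the mined rule as readable text."""
    data = select(records, ["income", "buys"])   # goal: predict 'buys' from 'income'
    data = clean(data)                           # remove records with missing fields
    threshold, accuracy = mine_threshold_rule(data, "income", "buys", "yes")
    # Interpretation: express the mined pattern as a classification rule.
    return f"IF income >= {threshold} THEN buys = yes (accuracy {accuracy:.2f})"


if __name__ == "__main__":
    print(kdd(RAW))
```

On this toy data the sketch prints `IF income >= 62 THEN buys = yes (accuracy 1.00)`. The accuracy is measured on the same records used to find the rule, so it says nothing about how the rule would generalize. A real KDD process would also need to validate the pattern on separate data.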