Available Online at www.ijcsmc.com
International Journal of Computer Science and Mobile Computing
A Monthly Journal of Computer Science and Information Technology
ISSN 2320–088X
IJCSMC, Vol. 2, Issue 5, May 2013, pg. 169–174
RESEARCH ARTICLE
© 2013, IJCSMC All Rights Reserved

Improved Data Reduction Technique in Data Mining

Pritesh Vora¹, Bhavesh Oza²
¹ Information Technology, Gujarat Technological University, L.D. College of Engineering, Ahmedabad, India
² Computer Engineering Department, Gujarat Technological University, L.D. College of Engineering, Ahmedabad, India
¹ pritesh2212@gmail.com; ² bhavesh_oza_2001@yahoo.com

Abstract: Data reduction is an important issue in data mining today. Data sets are huge, but much of the data is irrelevant to the objective or redundant, which increases processing cost and can lead to wrong results. Data reduction means reducing the volume of data without compromising its integrity. Decision trees, attribute subset selection, clustering, and data cube aggregation are techniques commonly used for data reduction. A decision tree is a highly effective structure that lays out the possible outcomes. In a decision tree, each branch node represents a choice between alternatives and each leaf node represents a decision or classification. Here we present the generalized algorithm and apply the decision tree technique to obtain reliable outcomes.

Key Terms: Data mining; Decision tree; Data reduction

I. INTRODUCTION

As computer technology develops and society becomes increasingly information-driven, people recognize that the data they need now exists on a massive scale. Data mining is the process of extracting important information and knowledge from large databases (mass data) [1]. This information and knowledge is implicit in the data: it is not known in advance, but it is potentially useful.
At present, the decision tree is an important data mining method. Decision trees are commonly used in decision analysis, data mining, and machine learning to create knowledge structures that guide the decision-making process. Accessing a large amount of data in a database is time-consuming, and maintaining that data is difficult. A database typically contains irrelevant, noisy, and duplicate data. Pre-processing improves the quality of the data and makes it more practical to work with, so data reduction techniques must be applied to remove such data.

II. DECISION TREE AND ID3

A decision tree provides a highly effective structure that gives an idea of the possible outcomes. It is a tree in which each branch node represents a choice between a number of alternatives, and each leaf node represents a classification or decision. Every decision tree begins with a root node, which is the ancestor of every other node. Each internal node evaluates an attribute of the data and determines which path to follow [1]. Typically, the decision test compares a value against some constant. Classification with a decision tree is performed by routing from the root node down the tree until a leaf node is reached. A more general definition of the decision tree, written in stepwise form, is as follows: