International Journal on Recent and Innovation Trends in Computing and Communication, ISSN: 2321-8169, Volume: 3, Issue: 4, 1998 - 2002
IJRITCC | April 2015, Available @ http://www.ijritcc.org

Improved Data Mining Analysis by Dataset creation using Horizontal Aggregation and B+ Tree

Avisha Wakode, Mrs. D. A. Chaudhari, DYPCOE - Akurdi, Savitribai Phule Pune University

Abstract—Data mining is one of the emerging fields in research and information retrieval. Data mining tools require data in the form of a data set, so data set preparation is one of the important tasks in data mining. A data set is a collection of data stored in a relational database whose schema is highly normalized. To analyze data efficiently, data mining systems widely use data sets with columns in a horizontal tabular layout. The two main components of SQL code are joins and aggregations. Vertical aggregations have limitations for building data sets because they return one column per aggregated group using group functions. Preparing a data set for data mining analysis is generally the most tedious and time-consuming task in a data mining project, as it requires many complex SQL queries, joining tables and columns, and aggregating columns. A powerful class of methods is introduced to generate SQL code that returns aggregated columns in a horizontal or cross-tabular form, returning a set of numbers instead of one number per row. This new class of methods is called horizontal aggregations. Horizontal aggregations are evaluated using three methods: CASE, SPJ and PIVOT. Data mining also deals with searching for information. This paper focuses on the creation of a B+ tree to reduce information search time so that the efficiency of the system increases.

Keywords—Aggregation, PIVOT, SPJ, CASE, Dataset.
I. INTRODUCTION

Data mining is the process of analyzing data from different dimensions, categorizing it, and summarizing it into useful information. Data mining technology automates the process of information search or information retrieval. In a relational database, a lot of effort is required to prepare a data set that can be used as input for a data mining or statistical algorithm. Most algorithms require as input a data set in horizontal form, with several records and one variable or dimension per column. This holds for different models such as clustering, classification, regression and PCA. Different terms are used to describe the data set: data mining uses point-dimension, whereas the statistics literature and machine learning research use observation-variable and instance-feature, respectively. Preparing a useful and appropriate data set for data mining takes considerable time.

The two main components of SQL code are joins and aggregations. The best-known aggregation is the aggregation of a column over a group of rows. SQL offers many aggregation functions and operators, including sum(), count(), avg(), etc. However, all these aggregations have limitations when preparing data sets for data mining purposes. With this drawback in mind, a new class of aggregate methods is introduced that aggregates numeric expressions and transposes the data to produce a data set with a horizontal layout. Methods belonging to this new class of aggregate functions are called horizontal aggregations. Horizontal aggregations are an extended form of existing vertical aggregations: they return a set of values in cross-tabular form instead of a single value per row. Horizontal aggregations provide several unique features and advantages. Firstly, they provide a template to generate SQL code from a data mining tool.
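As a minimal sketch of the contrast between a vertical aggregation and its horizontal counterpart, the following Python example uses the standard sqlite3 module and a CASE-based query generated from a template. The table sales, its columns, the sample rows, and the helper build_horizontal_sql are hypothetical names chosen only for illustration; they are not taken from this paper's implementation.

```python
import sqlite3


def build_horizontal_sql(table, group_col, pivot_col, value_col, pivot_values):
    """Hypothetical helper: build a CASE-based horizontal aggregation query.

    Identifiers and values are interpolated directly into the SQL text,
    so this sketch is only suitable for trusted, illustrative input.
    """
    cases = ",\n  ".join(
        f"SUM(CASE WHEN {pivot_col} = '{v}' THEN {value_col} ELSE NULL END) "
        f"AS {value_col}_{v}"
        for v in pivot_values
    )
    return (
        f"SELECT {group_col},\n  {cases}\n"
        f"FROM {table}\nGROUP BY {group_col}\nORDER BY {group_col}"
    )


# Illustrative in-memory table (assumed data, not from the paper).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, quarter TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("S1", "Q1", 10), ("S1", "Q2", 20), ("S2", "Q1", 5),
     ("S2", "Q2", 15), ("S1", "Q1", 7)],
)

# Vertical aggregation: one row per (store, quarter) group, one value per row.
vertical = conn.execute(
    "SELECT store, quarter, SUM(amount) FROM sales "
    "GROUP BY store, quarter ORDER BY store, quarter"
).fetchall()
print(vertical)
# [('S1', 'Q1', 17), ('S1', 'Q2', 20), ('S2', 'Q1', 5), ('S2', 'Q2', 15)]

# Horizontal aggregation: one row per store, one column per quarter.
pivot_values = [row[0] for row in conn.execute(
    "SELECT DISTINCT quarter FROM sales ORDER BY quarter")]
horizontal_sql = build_horizontal_sql(
    "sales", "store", "quarter", "amount", pivot_values)
horizontal = conn.execute(horizontal_sql).fetchall()
print(horizontal)
# [('S1', 17, 20), ('S2', 5, 15)]
```

In this sketch the horizontal result already has one record per store and one column per quarter, which is the tabular shape the text above describes as the input most algorithms expect.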
Secondly, they minimize manual work in a data mining project. Horizontal aggregations can be used by a data mining algorithm for data mining analysis. They form a new class of aggregate functions that can be used to prepare data sets in a horizontal layout or in cross-tabular form, extending SQL capabilities. There are different ways and methods of information retrieval, but retrieving information efficiently is a challenging task.

II. LITERATURE SURVEY

Aggregations play a vital role in SQL code. Aggregation is the grouping of the values of multiple rows together to form a single value. There are two types of aggregation techniques: vertical aggregation and horizontal aggregation.

A. Vertical aggregation

Existing SQL aggregations are also called vertical aggregations. Vertical aggregations return a single value. The most common vertical aggregation functions include sum(), count(), avg(), min(), etc. Vertical aggregations are common in numerous settings, such as relational algebra. The output of existing aggregations cannot be used directly by a data mining algorithm, because a data mining algorithm requires data in the form of a data set. Obtaining a data set from the output of existing aggregations requires joining tables and columns, aggregating columns, and many complex queries. Vertical aggregations therefore have limitations for preparing data sets.

B. Horizontal aggregation

Horizontal aggregation returns a set of values instead of a single value. Horizontal aggregations return their output in a horizontal layout or in summarized form. The output of a horizontal aggregation can be used directly for data mining. The limitation of vertical aggregation is overcome in horizontal