International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 04 Issue: 12 | Dec-2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 1269

Cross Domain Data Fusion
Amit Wavhal 1, Prof. Suhasini Itkar 2
1,2 Department of Computer Engineering, P.E.S Modern College of Engineering

Abstract—In recent years we have seen an explosion of data on the World Wide Web. Most of this information is unstructured, and very little of it is available in relational or otherwise structured form. Data mining has come a long way in extracting information from such large volumes of data, which are now commonly called Big Data. The need of the hour, however, is to extract information from more than one viewpoint or dimension. Seeing information from a single perspective does not always give the right picture, yet organizations that use business intelligence (BI) systems typically take decisions based on a single dimension of data. As the world changes fast, an organization that wants to stay relevant in the market or remain a market leader needs to see the unseen dimensions of its data; data fusion techniques address this need. Each step is then analyzed with respect to load balancing, accuracy, and complexity. Depending on the performance of the algorithms, the best solution is selected.

Index Terms—Big Data, cross-domain data mining, data fusion, multi-modality data representation, deep neural networks, multi-view learning, matrix factorization, probabilistic graphical models, transfer learning, urban computing.

I. 
INTRODUCTION
A few years ago, data fusion (or information fusion) was seen mainly as a sensor data fusion technique: various sensors would produce data, and the information retrieved from those sensors would be used to sense the state of the external environment so that decisions could be taken accordingly. Fusion consists of joining or merging information that stems from several sources and exploiting the merged information in tasks such as answering questions, making decisions, producing numerical estimates for further predictive analysis, gaining insight into data patterns, and drawing useful inferences. Information fusion is a process that deals with the association, correlation, and combination of data and information from multiple sources to achieve refined estimates of parameters, characteristics, events, and behaviors for observed objects in an observed field of view. It is sometimes implemented in automated decision support systems.

Integrated information systems provide users with a unified view of multiple heterogeneous data sources. The integration system queries the underlying data sources, combines the results, and presents them to the user. With more and more information sources easily available via cheap network connections, either over the Internet or in company intranets, the desire to access all these sources through a consistent interface has been the driving force behind much research in the field of information integration. During the last three decades many systems that try to accomplish this goal have been developed, with varying degrees of success.
II. 
REVIEW OF LITERATURE
Data fusion is a process that consists of mapping source data into a target representation, identifying multiple representations of the same real-world object, and finally combining these representations, a step that is itself called data fusion. While fusing data, special care must be taken in handling data conflicts. This paper focuses on the definition and implementation of data fusion as in [1], as well as on a high-level understanding of some of the techniques that can be applied to data fusion, as in [2].

Data-sets from different domains are identified first, by applying rules for selecting the data-sets to be fused [1]. The second step is to identify the fields that need to be used in the data fusion process. Once data-sets with the right fields are identified, a data mining process is applied to each individual data-set. Once knowledge is extracted from the individual data-sets, inference data points are populated that can be used for data fusion, and the two data-sets from different domains are then fused. The most important part of data fusion is discovering the association between the data-sets of the different domains. The fields selected for fusion are used to discover the association between the different domain fields, and the knowledge or patterns discovered in this way support our conclusions.

Integrated information systems must usually deal with diverse representations of data (schemata). In order to present query results to the user in a single unified schema, the schematic heterogeneities must be bridged. Data from the data sources must be converted to conform to the global schema of the information system. Two methods are common to bridge heterogeneity and thus specify data transformation: