International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 04 Issue: 12 | Dec-2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 1269
Cross Domain Data Fusion
Amit Wavhal1, Prof. Suhasini Itkar2
1,2 Department of Computer Engineering, P.E.S Modern College of Engineering
------------------------------------------------------------------***-----------------------------------------------------------------
Abstract—In recent years we have seen an explosion of data on
the World Wide Web. Most of this information is unstructured,
and very little of it is available in relational or otherwise
structured form. Data mining has come a long way in extracting
information from such large volumes of data, now commonly
called Big Data. The need of the hour, however, is to extract
information from more than one viewpoint or dimension.
Seeing information from a single perspective does not always
give the right picture, yet organizations using business
intelligence (BI) systems often take decisions based on a
single dimension of data. Because the world is changing fast,
an organization that wants to stay relevant or remain a market
leader needs to see the unseen dimensions of its data; data
fusion techniques address this need. Each step is then
analyzed with respect to load balancing, accuracy, and
complexity, and the best solution is chosen according to the
performance of the algorithms.
Index Terms—Big Data, cross-domain data mining,
data fusion, multi-modality data representation, deep
neural networks, multi-view learning, matrix
factorization, probabilistic graphical models, transfer
learning, urban computing.
I. INTRODUCTION
Until a couple of years ago, data fusion (or information fusion)
was seen mainly as a sensor data fusion technique: various
sensors would produce data, and the information retrieved from
those sensors would be used to sense the state of the external
environment so that decisions could be taken accordingly. Fusion
consists of combining or merging information that stems from
several sources and exploiting the merged information in various
tasks such as answering questions, making decisions, producing
numerical estimates for further predictive analysis, gaining
insight into data patterns, and drawing useful inferences.
Information fusion is the process of dealing with the association,
correlation, and combination of data and information from
multiple sources to achieve refined estimates of parameters,
characteristics, events, and behaviors for observed objects in
an observed field of view. It is sometimes implemented in
automated decision support systems.
Integrated information systems provide users with a
unified view of multiple heterogeneous data sources. Querying
the underlying data sources, combining the results, and
presenting them to the user is performed by the integration
system.
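The role of the integration system described above can be illustrated with a minimal sketch. The two sources, their field names (`cust_name`, `town`, `name`, `city`), and the function `integrated_query` are hypothetical and are not taken from any system discussed in this paper; the sketch only shows the three duties named above: querying each source, combining the results, and presenting them in one unified view.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]

# Hypothetical source 1: a CRM export with its own field names.
CRM_ROWS: List[Record] = [
    {"cust_name": "Asha", "town": "Pune"},
    {"cust_name": "Ravi", "town": "Mumbai"},
]

# Hypothetical source 2: a billing system with different field names.
BILLING_ROWS: List[Record] = [
    {"name": "Meera", "city": "Pune"},
]


def crm_to_global(row: Record) -> Record:
    """Translate a CRM row into the assumed global schema (name, city)."""
    return {"name": row["cust_name"], "city": row["town"]}


def billing_to_global(row: Record) -> Record:
    """Translate a billing row into the assumed global schema (name, city)."""
    return {"name": row["name"], "city": row["city"]}


# Each entry: (source name, its rows, its translation function).
SOURCES: List[Tuple[str, List[Record], Callable[[Record], Record]]] = [
    ("crm", CRM_ROWS, crm_to_global),
    ("billing", BILLING_ROWS, billing_to_global),
]


def integrated_query(city: str) -> List[Record]:
    """Query every source, combine the results, and return a unified view."""
    results: List[Record] = []
    for source_name, rows, to_global in SOURCES:
        for row in rows:
            unified = to_global(row)
            if unified["city"] == city:
                unified["source"] = source_name  # keep provenance for the user
                results.append(unified)
    return results
```

Calling `integrated_query("Pune")` returns one record from each hypothetical source, both expressed in the same schema. The user never sees the source-specific field names.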
With more and more information sources easily available
via cheap network connections, either over the Internet or in
company intranets, the desire to access all these sources
through a consistent interface has been the driving force
behind much research in the field of information integration.
During the last three decades many systems that try to
accomplish this goal have been developed, with varying
degrees of success.
II. REVIEW OF LITERATURE
Data fusion is a process that consists of mapping source data
into a target representation, identifying multiple representations
of the same real-world object, and finally combining these
representations, which is the fusion step itself. While fusing
data, special care must be taken in handling data conflicts. This
paper focuses on the definition and implementation as in [1], as
well as on a high-level understanding of some of the techniques
that can be applied to data fusion in [2].
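The three steps above (mapping to a target representation, identifying representations of the same real-world object, and combining them while resolving conflicts) can be sketched as follows. This is an illustration under stated assumptions, not the method of [1] or [2]: the field names, the name-based duplicate key, and the "most recent non-missing value wins" conflict rule are all hypothetical choices made for the example.

```python
from collections import defaultdict

# Hypothetical source records, each in its own schema.
SOURCES = {
    "a": [{"full_name": "Amit Rao", "phone": "020-1111", "updated": 2015}],
    "b": [
        {"name": "AMIT RAO", "tel": None, "updated": 2017},
        {"name": "Neha Joshi", "tel": "022-2222", "updated": 2016},
    ],
}

# Step 1 (assumed mappings): source field -> target field.
MAPPINGS = {
    "a": {"full_name": "name", "phone": "phone", "updated": "updated"},
    "b": {"name": "name", "tel": "phone", "updated": "updated"},
}

TARGET_FIELDS = ("name", "phone", "updated")


def map_to_target(record, mapping):
    """Rename source fields into the target representation."""
    return {target: record.get(src) for src, target in mapping.items()}


def entity_key(record):
    """Step 2: naive duplicate detection on the normalized name."""
    return " ".join(record["name"].lower().split())


def fuse(records):
    """Step 3: resolve conflicts, preferring the newest non-missing value."""
    newest_first = sorted(records, key=lambda r: r["updated"], reverse=True)
    fused = {}
    for field in TARGET_FIELDS:
        fused[field] = next(
            (r[field] for r in newest_first if r[field] is not None), None
        )
    return fused


def fuse_sources(sources):
    """Run mapping, duplicate detection, and fusion over all sources."""
    groups = defaultdict(list)
    for source_id, records in sources.items():
        for record in records:
            mapped = map_to_target(record, MAPPINGS[source_id])
            groups[entity_key(mapped)].append(mapped)
    return [fuse(group) for group in groups.values()]
```

In this sketch the two records for the same person disagree on the phone number (one is missing). The fused record takes the name and timestamp from the newer record and falls back to the older record for the phone number. Other conflict-handling strategies, such as voting or trusting a preferred source, would replace only the `fuse` function.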
Data-sets from different domains are identified first, by
applying rules for the selection of data-sets for data fusion [1].
The second step is to identify the fields that need to be used in
the data fusion process. Once data-sets with the right fields are
identified, a data mining process is applied to each individual
data-set. The knowledge extracted from each data-set yields
inference data points that can be used for data fusion, and the
two data-sets from different domains are then fused. The most
important part of data fusion is to discover the association
between the data-sets of the different domains. The fields
selected for fusion are used to discover the association between
the domains, and the resulting knowledge or patterns support
our conclusions.
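The workflow above can be sketched end to end on toy data. Everything in this example is an assumption made for illustration: the two domains (traffic counts and air quality), the shared field `district`, the per-district mean standing in for the mining step, and the Pearson correlation standing in for association discovery. The paper does not prescribe these particular choices.

```python
from math import sqrt
from statistics import mean

# Hypothetical data-set from domain 1: traffic counts per district.
TRAFFIC = [
    {"district": "D1", "vehicles": 100},
    {"district": "D1", "vehicles": 200},
    {"district": "D2", "vehicles": 300},
    {"district": "D3", "vehicles": 450},
]

# Hypothetical data-set from domain 2: air quality per district.
AIR = [
    {"district": "D1", "pm25": 30},
    {"district": "D2", "pm25": 60},
    {"district": "D3", "pm25": 90},
    {"district": "D4", "pm25": 10},  # no traffic data for this district
]


def summarize(records, key, field):
    """Per-data-set knowledge extraction: mean of `field` for each `key`."""
    grouped = {}
    for record in records:
        grouped.setdefault(record[key], []).append(record[field])
    return {k: mean(values) for k, values in grouped.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (sx * sy)


def associate(traffic, air):
    """Fuse the two summaries on the shared key and measure association."""
    by_traffic = summarize(traffic, "district", "vehicles")
    by_air = summarize(air, "district", "pm25")
    shared = sorted(by_traffic.keys() & by_air.keys())
    xs = [by_traffic[d] for d in shared]
    ys = [by_air[d] for d in shared]
    return shared, pearson(xs, ys)
```

`associate(TRAFFIC, AIR)` fuses only the districts present in both data-sets and reports how strongly the two selected fields move together. A coefficient near 1 or -1 suggests an association worth investigating further; it does not by itself establish a causal link between the domains.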
Integrated information systems must usually deal with
diverse representations of data (schemata). In order to
present query results to the user in a single unified schema,
the schematic heterogeneities must be bridged. Data from the
data sources must be converted to conform to the global
schema of the information system. Two methods are common
to bridge heterogeneity and thus specify data transformation: