International Journal of Software Engineering & Applications (IJSEA), Vol.10, No.5, September 2019
DOI: 10.5121/ijsea.2019.10503

DATA VIRTUALIZATION FOR DECISION MAKING IN BIG DATA

Manoj Muniswamaiah, Tilak Agerwala and Charles Tappert
Seidenberg School of CSIS, Pace University, White Plains, New York

ABSTRACT

Data analytics and Business Intelligence (BI) are essential components of decision support technologies that gather and analyze data for faster and better strategic and operational decision making in an organization. Data analytics emphasizes algorithms that determine the relationships among data in order to offer insights. The major difference between BI and analytics is that analytics has predictive capability, which helps in forecasting future outcomes, whereas BI supports informed decision-making based on the analysis of past data. Business Intelligence solutions are among the most valued data management tools; their main objective is to enable interactive access to real-time data, allow manipulation of that data, and provide business organizations with appropriate analysis. Business Intelligence solutions leverage software and services to collect and transform raw data into useful information that enables better-informed, higher-quality business decisions regarding customers, market competitors, internal operations and so on. Data needs to be integrated from disparate sources in order to derive valuable insights. Extract-Transform-Load (ETL) processes, which are traditionally employed by organizations, help in extracting data from different sources, transforming and aggregating it, and finally loading large volumes of data into warehouses. Recently, data virtualization has been used to speed up the data integration process. Data virtualization and ETL often serve unique and complementary purposes in performing complex, multi-pass data transformation and cleansing operations and in bulk loading the data into a target data store.
In this paper, we provide an overview of the data virtualization technique used for data analytics and BI.

KEYWORDS

Data Analytics, Business Intelligence, Big Data, Data Virtualization, ETL and Data Integration.

1. INTRODUCTION

The success of an organization depends on its business strategies and decision-making process. These decisions depend heavily on the collection and analysis of the data gathered. In the 1990s, when business intelligence and analytics were just emerging, the data generated by various legacy systems was mostly structured. Decision support systems were based on this collected data, which was stored in relational databases. These databases also supported queries, online analytical processing and reporting on enterprise-specific data. Besides reporting functionality, data mining techniques such as clustering, regression analysis, anomaly detection and classification were also supported.

In the early 2000s, the rise of the Internet helped HTTP-based web search engines such as Google and Yahoo, and e-commerce companies such as Amazon, to run their businesses online and interact with their users directly. Companies began collecting user-specific data through server logs and cookies in order to better understand user behavior and identify new business opportunities. Web-based analytics tools such as Google Analytics and Adobe Analytics were developed to analyze user clickstream logs, which indicate user behavior across web pages and help improve a website's key conversion metrics. These tools provide key insights through user path analysis, user