International Journal of Modern Communication Technologies & Research (IJMCTR), ISSN: 2321-0850, Volume-2, Issue-2, February 2014, www.erpublication.org

Abstract: Data warehousing, in simple terms, creates a central and enduring store for the numerous data sources required to support a company's analysis, reporting, and other Business Intelligence functions. It gives the business deep insight into its existing data, which helps it make decisions more efficiently. The dependence of businesses on data warehouses is increasing rapidly. Because decisions are based on the information held in the warehouse's databases, the technology offers a competitive advantage. The importance of the data warehouse therefore cannot be denied. This paper discusses the data warehouse in detail, with a thorough study of its architecture: from the acquisition of data to its detailed design, its storage and access, and its metadata management components.

Index Terms: Data Cubes, Data Marts, Meta Data, Roll-Up Display.

Manuscript received Feb. 20, 2014.

I. INTRODUCTION

A data warehouse is “a subject-oriented, integrated, non-volatile, time-variant collection of data in support of management’s decision” (Inmon, W.H., 1992, in Elmasri and Navathe, 2000, p. 842). The data is extracted from heterogeneous operational systems and external data sources, cleansed to ensure validity, transformed to remove inconsistencies and homogenize the data, aggregated, and finally loaded into the data warehouse. Once inserted into the data warehouse, data cannot be changed, although it can be deleted; changes occur only when modifications of the source data are propagated into the warehouse. In other words, data warehousing comprises a set of decision support technologies that provide knowledge workers (executives, analysts, managers, etc.) with adequate, high-quality information so that they can make better and faster decisions.

II. FEATURES OF DATA IN DATA WAREHOUSE

The definition above specifies several features of the data in a data warehouse, which are as follows:

A. Subject-Oriented

Data in a data warehouse is subject-oriented, which means that the data queried relates to a specific subject area (e.g., products, customers, regions).

B. Integrated

Data is integrated into the data warehouse from several, possibly heterogeneous, operational systems (database systems, flat files, etc.) and from external data sources such as the World Wide Web and statistical databases. The data from the various sources is homogenized before integration takes place, i.e., format inconsistencies are removed. Because data collected from various sources can be incomplete or erroneous, it is cleaned to ensure validity. The data is then loaded into the data model of the warehouse.

C. Non-Volatile

This feature states that warehouse data is mostly non-volatile, which implies that the data is read-only. “The term non-volatile means that, once inserted, data cannot be changed, though it might be deleted.” (Date, C.J., 2000). Changes to the data take place only when modifications of the source data are propagated into the warehouse.

D. Time Variant

The time-variant feature reflects the need to access historical data, which is one of the reasons for adopting a data warehouse: decision-making relies on business trend analysis, which in turn requires historical data. The data warehouse keeps periodic snapshots of the corresponding operational data, which are necessary for analyses with respect to time.

E. Different from OLTP Databases

Traditional Online Transaction Processing (OLTP) systems are not the right choice for supporting decision-making. A data warehouse instead supports Online Analytical Processing (OLAP).
In OLTP systems, information accessibility problems persist even when high-speed networks are in place, for the following reasons:

- OLTP databases maintain current data in great detail. Each transaction requires detailed, up-to-date data, and the part of the database that is accessed is updated immediately after each operation completes. In OLAP, by contrast, historical, summarized, and consolidated data matters more than detailed data, because the emphasis is on decision-making.

- Data in an OLTP database ranges from hundreds of megabytes to gigabytes in size, whereas a data warehouse can be much larger than the operational databases, ranging from hundreds of gigabytes to terabytes.

- The key performance measure in OLTP is transaction throughput, whereas in OLAP query throughput and response times are more important than transaction throughput.

Architecture of Data Warehouse: A Comprehensive Study
Rashmi Bhatia
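The contrast between OLTP and OLAP workloads described in Section II-E, together with the non-volatile and time-variant features of Sections II-C and II-D, can be illustrated with a minimal sketch. It uses Python's standard `sqlite3` module with an in-memory database. The table names (`stock`, `stock_snapshot`), column names, function names, and sample figures are all hypothetical and chosen only for illustration; they do not come from the paper, and a real warehouse would use a dedicated platform rather than SQLite.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with one OLTP-style table and one
    warehouse-style snapshot table (all names and rows are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    # OLTP side: one row per product, holding only the current value.
    conn.execute("CREATE TABLE stock (product TEXT PRIMARY KEY, units INTEGER)")
    # Warehouse side: periodic snapshots that are appended, never updated.
    conn.execute(
        "CREATE TABLE stock_snapshot ("
        "snapshot_month TEXT, product TEXT, region TEXT, units_sold INTEGER)"
    )
    conn.executemany(
        "INSERT INTO stock VALUES (?, ?)",
        [("widget", 100), ("gadget", 50)],
    )
    conn.executemany(
        "INSERT INTO stock_snapshot VALUES (?, ?, ?, ?)",
        [
            ("2013-11", "widget", "north", 30),
            ("2013-11", "widget", "south", 20),
            ("2013-12", "widget", "north", 45),
            ("2013-12", "widget", "south", 25),
            ("2013-12", "gadget", "north", 10),
        ],
    )
    conn.commit()
    return conn


def oltp_sell(conn, product, quantity):
    """OLTP-style transaction: touch one detailed row and update it in place."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE stock SET units = units - ? WHERE product = ?",
            (quantity, product),
        )
    row = conn.execute(
        "SELECT units FROM stock WHERE product = ?", (product,)
    ).fetchone()
    return row[0]


def olap_monthly_totals(conn, product):
    """OLAP-style query: read-only summary over historical snapshots,
    rolling the regional detail up to one total per month."""
    return conn.execute(
        "SELECT snapshot_month, SUM(units_sold) FROM stock_snapshot "
        "WHERE product = ? GROUP BY snapshot_month ORDER BY snapshot_month",
        (product,),
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo()
    print(oltp_sell(conn, "widget", 3))         # 97
    print(olap_monthly_totals(conn, "widget"))  # [('2013-11', 50), ('2013-12', 70)]
```

In this sketch the OLTP function changes current data in place and cares only about one row, while the OLAP function never modifies the snapshot table and returns consolidated figures across time, which is the kind of query for which response time, rather than transaction throughput, is the relevant measure.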