International Journal of Computer Applications (0975-8887)
Volume 118, No. 13, May 2015

Validation of Object Oriented Metrics for Evaluating Understandability of Data Warehouse Models

Jaspreeti Singh
Assistant Prof., CSE Dept., USIT, GGSIPU, Dwarka, Delhi-75

Srishti Vashishtha
Student, M.Tech, CSE Dept., USICT, GGSIPU, Dwarka, Delhi-75

ABSTRACT
A data warehouse plays a key role in formulating strategic decisions, so it is essential to maintain its quality. Metrics have generally been used to guide designers toward quality data models. Numerous researchers have proposed metrics for multidimensional data warehouse models. These metrics must be empirically validated to prove their practical utility. This paper presents an empirical validation of object oriented metrics for multidimensional data warehouse models at the conceptual level. The quality attribute understandability is assessed through various combinations of metrics. Univariate and multiple linear regression analyses are used to evaluate the quality of the multidimensional models. The results show that these metrics may be considered key indicators of the quality of multidimensional data models.

Keywords: Data warehouse, Metrics, Multidimensional models, Quality attributes

1. INTRODUCTION
In today's world, organizations need to gather, store and process huge volumes of data to perform day-to-day operations. They are able to do this at a comparatively low cost but fail to provide quality information [6]. Inmon provided the solution of adopting a data warehouse, which is defined as a "collection of subject-oriented, integrated, non-volatile data that supports the management decision process" [4]. A data warehouse is an environment that provides strategic information for analysing, discerning trends and monitoring performance.
It is essential for organizations to ascertain the quality of the information that they obtain from the data warehouse. Information quality in a data warehouse is determined by data quality, presentation quality and the quality of the data model (conceptual, logical and physical) [3]. Our aim in this paper is to ensure quality by evaluating the understandability of multidimensional models.

Structural properties have been recognised as major factors influencing the quality of a software product. Metrics based on structural properties have been widely used to assess quality attributes of a software artefact such as understandability, maintainability and fault-proneness [8]. Our present focus is on understandability, as it is the key measure of quality of data warehouse conceptual models. There is a relationship between structural properties, cognitive complexity, understandability and external quality attributes: structural complexity affects cognitive complexity, which in turn affects analyzability, understandability and modifiability, and these further affect external quality [15]. Hence, structural complexity plays a vital role in assessing the quality of a model. Structural complexity metrics help to estimate the quality of the information provided by the data warehouse. A slight error in the information can cause huge losses to organizations, so it is important to maintain quality.

Researchers have recommended quality attributes for multidimensional models and have also proposed metrics to estimate these quality attributes. In this paper we focus on one quality attribute, the understandability of multidimensional models. Metrics are objective indicators of quality: they provide a way of measuring quality factors consistently and objectively. Metrics can help to understand and improve software development and maintenance projects, and to maintain the quality of a system by highlighting key problem areas.
Serrano et al. [15] have already presented a set of metrics for computing the structural complexity of a multidimensional data warehouse model. The authors suggested that these metrics need to be validated to establish their practical utility and to draw final conclusions that may be applied in practice. Even though several quality frameworks for data models have been proposed, most of them lack valid quantitative measures for calculating the quality of conceptual data models objectively. A family of experiments is a significant part of the process of validating metrics, as it is widely accepted that only after executing a family of experiments is it possible to build the collective knowledge needed to extract useful measurement conclusions to be applied in practice [15][16]. Hence, in this paper, we carry out an empirical validation on a dataset of eighteen multidimensional schemas for a data warehouse, using correlation and linear regression.

The rest of this paper is structured as follows. Section 2 explains the process of metrics validation. Section 3 discusses the metrics of the object oriented models for data warehouses. Section 4 describes the experimental setup. Section 5 shows the results of the various analysis methodologies. Section 6 discusses the threats to the validity of the results and the limitations of the study. Finally, Section 7 summarizes the work and presents the conclusion.

2. METRICS VALIDATION PROCESS
The metrics validation process consists of steps that ensure the reliability of the proposed metrics, and it is necessary to follow these steps. Figure 1 presents the method we follow for the metrics proposal [3].

[Figure 1. Box labels: Metric Definition, Theoretical Validation, Empirical Validation, Experiments, Case Studies]

In this figure we have three central activities: