Abstract: Model matching is at the core of model management operations such as model evolution, consolidation, and retrieval. Accurately identifying the similarities and differences between the elements of the matched models leads to accurate model matching, which in turn leads to better model management. Software metrics are the software engineer's means of quantifying the similarity between the elements of the matched models. In this paper, we empirically validate the use of different metrics for capturing the similarities and differences between the elements of two matched UML class diagrams. We also empirically investigate how calibrating the weights of compound metrics improves the similarity assessment of class diagrams. The results, based on two case studies, show the superiority of the compound metrics over the individual metrics.

Index Terms: Model matching, similarity metrics, reuse, weight calibration.

I. INTRODUCTION

Models in software development allow engineers to reduce the complexity of software systems. They are the developer's means for reasoning about requirements, communicating with stakeholders, documenting the system to ease maintenance, generating test cases, etc. [1]. As early-stage artifacts, models also provide great reuse potential [2]. Typically, for each software system, there is a set of models that describe its structural, behavioral, and functional perspectives. Our focus in this paper is the structural perspective, modeled by the UML (Unified Modeling Language) class diagram (see Fig. 1 for an example of a UML class diagram).

Over time, engineers unavoidably find themselves dealing with large collections of models. These models represent different development concerns [3]. Additionally, these models are considered a main source of knowledge captured from the minds of the people involved.
This knowledge is reapplied each time new software is created; yet, when comparing software systems, we usually find that 60% to 70% of a software product's functionality is common [4]. Thus, without effective management, a new system may be built from scratch even though a similar one has been built before. This results in duplicated artifacts, and thus redundant maintenance cost and time. Therefore, it is of paramount importance to have a systematic and efficient way to access and reuse existing software models. One way with great potential is to merge these models into a reference model that unifies their overlaps and explicates their differences. Another way is to have an efficient repository along with an efficient retrieval mechanism. In both cases, model matching is a fundamental operation.

Manuscript received August 20, 2014; revised October 31, 2014. This work was supported by the Deanship of Scientific Research at King Fahd University of Petroleum and Minerals (KFUPM), Saudi Arabia, under Research Grant 11-INF1633-04. The authors are with King Fahd University of Petroleum and Minerals, Dhahran, 31261, Saudi Arabia (e-mail: alkhiaty@kfupm.edu.sa, moataz@kfupm.edu.sa).

Model matching is the task of identifying semantic correspondences between elements of two models [3], [5]. The task is error-prone and time-consuming. It is error-prone because these models, while representing similar functionalities, are modeled independently by different developers, so inconsistencies, design differences, and conflicts among them are expected. Therefore, their similarities and differences must be accurately quantified to obtain an accurate match. Different metrics and matching algorithms have been proposed in the literature to identify the similarities and differences of the models to be matched, especially for UML diagrams [6]–[11].
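To make the idea of quantifying similarity concrete, the following minimal sketch scores pairs of class names lexically and then pairs classes one-to-one, best score first. It is an illustration under stated assumptions, not the metrics or algorithms of [6]–[11]: the function names `name_similarity` and `greedy_match`, the 0.5 threshold, and the use of Python's `difflib.SequenceMatcher` as the lexical measure are all hypothetical choices made for this example.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Lexical similarity in [0, 1] between two class names (case-insensitive).

    Assumption: SequenceMatcher.ratio() stands in for a name-based metric.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def greedy_match(model_a, model_b, threshold=0.5):
    """Pair class names of two models one-to-one, highest-scoring pairs first.

    Pairs scoring below `threshold` (a hypothetical cutoff) are left unmatched.
    """
    # Score every cross-model pair, then visit pairs from most to least similar.
    scored = sorted(
        ((name_similarity(a, b), a, b) for a in model_a for b in model_b),
        key=lambda t: t[0],
        reverse=True,
    )
    used_a, used_b, pairs = set(), set(), []
    for score, a, b in scored:
        if score < threshold:
            break  # all remaining pairs score lower still
        if a not in used_a and b not in used_b:
            pairs.append((a, b, score))
            used_a.add(a)
            used_b.add(b)
    return pairs


if __name__ == "__main__":
    diagram_1 = ["Customer", "Order"]
    diagram_2 = ["Client", "Orders", "customer"]
    for a, b, score in greedy_match(diagram_1, diagram_2):
        print(f"{a} <-> {b} (similarity {score:.2f})")
```

A purely lexical score like this one would miss a correspondence such as Customer and Client, which illustrates why models built independently by different developers are hard to match on names alone.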
In prior work, we presented the use of the Simulated Annealing (SA) algorithm [12] and the greedy algorithm [13] for model matching using different similarity metrics. Two types of metrics were used: individual metrics (e.g., a metric that measures the similarity between two classes based on their lexical names) and compound metrics (e.g., a metric that measures the similarity between two classes based on a combination of their names and their neighborhood information). Compound metrics, with even weight assignments for their constituents (individual metrics), showed more consistent performance across the two case studies than did the individual metrics. However, their accuracy was limited by the confounding effects of some of their constituents. Calibrating the weights of the constituents of a compound metric can control the confounding effect of some constituents and thus make the compound metric a sounder measure [13], [14].

In this paper, we empirically validate the effect on matching accuracy of the weight assignments for the different compound metrics introduced in our prior work. Using two case studies, we empirically show how appropriate weight settings for a compound metric can improve the matching accuracy with consistent performance across the two case studies.

Since our focus here is the UML class diagram, the word “model” henceforth refers to a UML class diagram, which represents the structural view of a software system. Additionally, the words “element” and “class” are used interchangeably.

The paper is organized as follows. Section II introduces the comparison framework. Section III presents the empirical investigation setup and the data collection. Section IV presents the empirical results and analysis. Section V summarizes the conclusions of this paper and some future directions.

UML Class Diagrams: Similarity Aspects and Matching
Mojeeb Al-Rhman Al-Khiaty and Moataz Ahmed
Lecture Notes on Software Engineering, Vol. 4, No. 1, February 2016, p. 41. DOI: 10.7763/LNSE.2016.V4.221