Business Process Similarity Metric Supporting One-to-Many Relationship

Maria Laura SEBU
Computer and Software Engineering Department
Politehnica University of Timisoara
Timisoara, Romania
laura.sebu@student.upt.ro

Horia CIOCÂRLIE
Computer and Software Engineering Department
Politehnica University of Timisoara
Timisoara, Romania
horia.ciocarlie@cs.upt.ro

Abstract: Graph matching techniques are used in many areas to compare objects and identify common characteristics. In this paper we apply graph similarity techniques to the business processes used inside organizations and extracted with process mining techniques. The aim is to determine whether an organization uses a process similar to that of another organization for a specific business case. Because an exact match is unlikely, error-tolerant graph matching techniques are more suitable for real-life data. Business processes can also differ in granularity: one business process may be more detailed in specific areas than the process it is compared with. The custom business process matching algorithm presented in this paper therefore takes into account a one-to-many relation between activities: one activity may be matched with a set of activities in the other graph. Such information is important for extracting the common characteristics of organizations and could serve as an input for choosing a collaborator. Business processes that are not already available are extracted with process mining techniques and reduced to directed graph format. A custom graph similarity algorithm extended for multivalent nodes is then applied and a business process similarity factor is obtained.

Keywords: graph match, business process similarity, process mining

I. INTRODUCTION

Graph-based techniques are an important tool for detecting common characteristics. In many applications an important operation is the comparison between two objects or between an object and a model.
Graph matching is the technique of finding a correspondence between the nodes and the edges of two graphs. Matching can be performed at two levels. Exact matching requires a strict correspondence between the objects being matched, or between their subparts: all vertex and edge features must be preserved. Inexact (error-tolerant) matching sets a tolerance level, so that some vertex and edge features need not be preserved and a correspondence is found even if the objects compared are structurally different. A matching is univalent when each vertex is associated with one vertex in the other graph; a multivalent matching allows one vertex to be associated with a set of vertices [1].

This paper is supported by the Sectoral Operational Programme Human Resources Development POSDRU/159/1.5/S/137516, financed from the European Social Fund and by the Romanian Government.

A multivalent matching is suitable for business process comparison because the same business process can have one topological representation at one granularity level and a different representation at a finer granularity level. The graph edit distance proposed in this paper allows both univalent and multivalent matching. The one-to-many relation that models a multivalent match can be established automatically, or it can be predefined as a prerequisite for the execution of the algorithm when an automatic match cannot be performed.

Graph isomorphism is the strictest form of graph matching: the mapping must be bijective and the edge-preserving condition must hold in both directions. Subgraph isomorphism is a weaker form of exact matching, which requires that a component of one graph be isomorphic to the other graph. These problems are computationally hard in general: subgraph isomorphism is NP-complete and computing the graph edit distance is NP-hard, while graph isomorphism is in NP but is not known to be either NP-complete or solvable in polynomial time. Polynomial isomorphism algorithms have been developed for special kinds of graphs but not for the general case.
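The notions introduced above can be illustrated with a minimal Python sketch. It is not the algorithm proposed in this paper. It only shows, under stated assumptions, how a matching can be classified as univalent or multivalent and how the bijective, edge-preserving condition of graph isomorphism can be checked by brute force on very small graphs. The activity names, the dictionary representation of a matching, and the helper names `is_univalent` and `is_isomorphic` are hypothetical and chosen for illustration; univalent is read here as "at most one partner per vertex on either side".

```python
from itertools import permutations


def is_univalent(matching):
    """Return True if no vertex, on either side, has more than one partner.

    `matching` maps each vertex of the first graph to the set of vertices
    it is associated with in the second graph (assumed representation).
    """
    targets = [t for partners in matching.values() for t in partners]
    one_partner_each = all(len(partners) <= 1 for partners in matching.values())
    no_shared_target = len(targets) == len(set(targets))
    return one_partner_each and no_shared_target


def is_isomorphic(nodes_a, edges_a, nodes_b, edges_b):
    """Brute-force isomorphism test for small directed graphs.

    Tries every bijection between the node sets and accepts one whose image
    of edges_a equals edges_b, i.e. edges are preserved in both directions.
    Runs in factorial time, so it is only an illustration.
    """
    if len(nodes_a) != len(nodes_b) or len(edges_a) != len(edges_b):
        return False
    ordered_a = sorted(nodes_a)
    for perm in permutations(sorted(nodes_b)):
        mapping = dict(zip(ordered_a, perm))
        if {(mapping[u], mapping[v]) for u, v in edges_a} == set(edges_b):
            return True
    return False


# Hypothetical example: the same process at two granularity levels.
coarse_nodes = {"Receive order", "Check order", "Ship"}
coarse_edges = {("Receive order", "Check order"), ("Check order", "Ship")}

detailed_nodes = {"Receive order", "Check stock", "Check credit", "Ship"}
detailed_edges = {
    ("Receive order", "Check stock"),
    ("Check stock", "Check credit"),
    ("Check credit", "Ship"),
}

# One coarse activity is associated with two detailed activities.
multivalent = {
    "Receive order": {"Receive order"},
    "Check order": {"Check stock", "Check credit"},
    "Ship": {"Ship"},
}

print(is_univalent(multivalent))  # False: "Check order" has two partners
print(is_isomorphic(coarse_nodes, coarse_edges,
                    detailed_nodes, detailed_edges))  # False: sizes differ
```

In this example the two graphs are not isomorphic, because they differ in size, yet the one-to-many association of "Check order" with "Check stock" and "Check credit" still relates them. This is the situation that motivates a multivalent, error-tolerant match.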
We propose in this paper an extension of the graph edit distance algorithm that considers multivalent matching, and we present an effective heuristic method able to provide reliable results in polynomial execution time.

Of the matching techniques mentioned above, exact graph matching is rarely used in the business process management context. Graphs representing business processes can be obtained with process mining techniques by mining the event log dataset. Once the business processes are obtained, they are reduced to graph format. During this phase, in which a business process is represented as a graph, a lot of information is lost. In such a context an exact graph match is almost never found, although exact matching could still be used for the comparison of standardized processes. Subgraph isomorphism could be used to check whether a business process is part of another business process. It is also suitable for retrieving similarity for the subparts of standardized processes.

Process standardization is a clear direction followed in process management. A process is successfully standardized if it is executed each time in a predefined way: the activities are executed in the same order and produce the same results. In Business Process Management, as in many areas, identical match is rarely used for real-life processes. More