Mining Predictive Process Models out of Low-level Multidimensional Logs

Francesco Folino, Massimo Guarascio, and Luigi Pontieri

National Research Council of Italy (CNR), via Pietro Bucci 41C, I87036 Rende (CS), Italy
{ffolino,guarascio,pontieri}@icar.cnr.it

Abstract. Process Mining techniques have been gaining attention, especially as concerns the discovery of predictive process models. Traditionally focused on workflows, they usually assume that process tasks are clearly specified and referred to in the logs. This, however, limits their application to many real-life BPM environments (e.g. issue tracking systems) where the traced events do not match any predefined task, yet carry a lot of context data. To make the application of predictive process mining to such logs easier and more effective, we devise a new approach that combines the discovery of different execution scenarios with the automatic abstraction of log events. The approach has been integrated in a prototype system, supporting the discovery, evaluation and reuse of predictive process models. Tests on real-life data show that the approach achieves compelling prediction accuracy w.r.t. state-of-the-art methods, and finds interesting descriptions of activities and process variants.

Keywords: Business Process Analysis, Data Mining, Prediction.

1 Introduction

Process Mining techniques aim at extracting useful information from historical process logs, possibly in the form of descriptive or predictive process models, which can support the analysis, design, and improvement of business processes. An emerging research stream [1,9,4] concerns the induction of models for predicting a given performance measure for new cases at run time. Originally focused on workflow systems, Process Mining research has been moving towards less structured processes, possibly featuring a wide variety of behaviors and many low-level tasks.
This calls for enhancing classical approaches with the capability to capture diverse execution scenarios (a.k.a. “process variants”), and to map log events to high-level activity concepts [3], in order to prevent the construction of useless models giving a cumbersome and undergeneralized view of process behavior.

The need for expressive process views is also witnessed by the proliferation of works on activity abstraction [12,11,7] and on log clustering [14,9,4], as well as by recent efforts to model different process variants and their link to

M. Jarke et al. (Eds.): CAiSE 2014, LNCS 8484, pp. 533–547, 2014. © Springer International Publishing Switzerland 2014