Process Mining as First-Order Classification Learning on Logs with Negative Events

Stijn Goedertier¹, David Martens¹, Bart Baesens¹˒², Raf Haesen¹˒³, and Jan Vanthienen¹

¹ Department of Decision Sciences & Information Management, Katholieke Universiteit Leuven, Belgium
{myFirstName.myLastName}@econ.kuleuven.be
² School of Management, University of Southampton, United Kingdom
³ Vlekho Business School, Belgium

Abstract. Process mining is the automated construction of process models from information system event logs. In this paper we identify three fundamental difficulties related to process mining: the lack of negative information, the presence of history-dependent behavior, and the presence of noise. These difficulties can be elegantly dealt with when process mining is represented as first-order classification learning on event logs supplemented with negative events. A first set of process discovery experiments indicates the feasibility of this learning technique.

1 Introduction

Event logs of information systems such as ERP, Role Based Access Control, and Workflow Management systems conceal an untapped reservoir of knowledge about the way people conduct everyday business transactions. The vast quantity of available events, however, makes it difficult to analyze event logs using only descriptive statistics. Process mining, in contrast, is the automated construction of process models from event logs [1,2]. Process models that have been discovered through process mining enable organizations to compare the behavior in the event log with the business conduct they would expect from their employees and other stakeholders. Such a comparison can be helpful in the context of regulatory compliance or in the context of business process redesign and optimization. To date, many algorithms have been developed to describe or predict control-flow, data, or resource-related aspects of processes.

An important but difficult learning task in process mining is the discovery of sequence constraints from event logs, referred to as Process Discovery [3,4]. Other process learning tasks involve, for instance, learning allocation policies [5] and social networks [6]. Process mining faces many difficulties. One difficulty is that it is often limited to the harder setting of unsupervised learning: negative information about state transitions that were prevented from taking place is often not available in the event log and consequently cannot guide the search. Moreover, much of the behavior displayed in processes is non-local, history-dependent behavior. While a history of related events is a potentially

A. ter Hofstede, B. Benatallah, and H.-Y. Paik (Eds.): BPM 2007 Workshops, LNCS 4928, pp. 42–53, 2008. © Springer-Verlag Berlin Heidelberg 2008