Discovering Strong Principles of Expressive Music Performance with the PLCG Rule Learning Strategy

Gerhard Widmer

Dept. of Medical Cybernetics and Artificial Intelligence, University of Vienna, and Austrian Research Institute for Artificial Intelligence, Vienna
gerhard@ai.univie.ac.at

Abstract. We present a new rule learning algorithm named PLCG, a kind of ensemble learning method, that can find simple, robust partial theories (sets of classification rules) in complex data where neither high coverage nor high precision can be expected. The motivating application problem comes from an interdisciplinary research project that aims at discovering fundamental principles of expressive music performance from large amounts of complex real-world data (measurements of actual performances by concert pianists). It is shown that PLCG succeeds in finding some surprisingly simple and robust performance principles, some of which represent truly novel and musically meaningful discoveries. A more systematic experiment shows that PLCG learns significantly simpler theories than more direct approaches to rule learning, while striking a compromise between coverage and precision.

1 Introduction

The research described in the present paper is part of a large, long-term interdisciplinary research project situated at the intersection of the scientific disciplines of Musicology and AI [12]. The goal is to use intelligent data analysis methods to study the complex phenomenon of expressive music performance. We want to understand what great musicians do when they interpret and play a piece of music, and to what extent an artist’s musical choices are constrained or ‘explained’ by (a) the structure of the music, (b) common performance practices, and (c) cognitive aspects of music perception and comprehension. Formulating formal, quantitative models of expressive performance is one of the big open research problems in contemporary empirical musicology.
L. De Raedt and P. Flach (Eds.): ECML 2001, LNAI 2167, pp. 552–563, 2001. © Springer-Verlag Berlin Heidelberg 2001

Our project develops a new direction in this field: we use inductive machine learning to discover general and valid expression principles from (large amounts of) real performance data. The purpose of this research is knowledge discovery. We search for simple, general, interpretable models of aspects of expressive music performance (such as tempo and expressive timing, dynamics, articulation). To that end, we have compiled what is most probably the largest set of performance data (precise measurements of timing, dynamics, etc. of real musical performances) ever collected in empirical performance research. Specifically, we are analyzing large