Proceedings of the XIV Colloquium on Musical Informatics (XIV CIM 2003), Firenze, Italy, May 8-9-10, 2003

EXPRESSIVE CLASSIFIERS AT CSC: AN OVERVIEW OF THE MAIN RESEARCH STREAMS

Sergio Canazza, Giovanni De Poli, Luca Mion, Antonio Rodà, Alvise Vidolin, Patrick Zanon
CSC - DEI, University of Padua
Via Gradenigo 6/a, 35131 Padova, Italy
{canazza,depoli,randy,vidolin,patrick}@dei.unipd.it, ar@csc.unipd.it
http://www.dei.unipd.it/ricerca/csc/

ABSTRACT

Music can be a means of communication between performer and listener. Several studies have demonstrated that different expressive intentions can be conveyed by a musical performance and correctly recognized by the listener, and some models for the synthesis of expressive performances can be found in the literature. In this paper we describe three automatic expressive analysis methods based on studies carried out at CSC during the last year. A brief overview of their implementation is presented. Finally, some results of the validation are sketched.

Keywords: Machine Recognition of Music, Music Analysis, Psychoacoustics, Perception, Cognition, Real-time Systems, and Studio Report.

1. INTRODUCTION

Playing music is a complex task that requires of professional pianists both high motor control skills and fine cognitive abilities [1]. The former allow the performer to act on the instrument with high precision, while the latter provide the feedback used to tune and correct the movements during the piece. Together, these two aspects allow the performer to communicate his interpretation of the piece. On the other hand, listeners use their cognitive capacities to understand what the performer is communicating with his performance. It has been proven that music can be played in different ways in order to communicate the structural interpretation of the piece, tensions [2], and expressive content [3]. There is general agreement among performers and audiences on this kind of communication [4].
Starting from these observations, several models for the synthesis of expressive performances have been developed [5], [6]. These works enabled studies on the analysis side using the "analysis through the synthesis" approach [7], in which the synthesis models were used to understand the expressive content of a musical performance.

The expressive analysis of musical performances can be used for Automatic Content Processing (ACP), which is essential in today's rapidly evolving panorama of multimedia exchange over the Internet. In fact, Musical Information Retrieval (MIR) is an active research field, since the techniques developed for indexing textual information are inappropriate for multimedia. ACP can also be used, in accordance with the MPEG-7 standard, for the content description of multimedia products. Several projects were developed with this aim. For example, CUIDADO [8] aims to provide both a sound palette engine and a music retrieval system; they are flexible and can adapt their retrieval capabilities to the users' preferences. Other projects were developed outside Europe, for example the machine listening project of Hashimoto [9], [10], in which a computer continuously listens to pieces and trains itself to recognize certain authors or styles; moreover, a computer also searches the Internet for new timbre sounds and classifies them using machine learning techniques. Research on automated structural analysis is currently being carried out by Dannenberg [11]: a computer was instructed to recognize the section structure of audio data. Machine learning techniques are used at ÖFAI to recognize the performer's style [12], and by Dannenberg et al. to classify musical styles [13]. Friberg and Bresin [14] work on the extraction and classification of audio cues for the expressive content.
Their work is the closest to our approach; the main difference concerns the kind of data used: MIDI data in our case, instead of audio data.

In this paper we deal with the automatic expressive analysis tools for musical performances developed at CSC in the last year. These methods give an insight into this complex task and can provide some ideas for future work.

The paper is organized as follows: in the next section, a brief description of the method used to derive the analysis algorithms is presented, together with a brief description of each of them. Then a more detailed description of the implementation of the algorithms is presented. The third section is devoted to some validation.

2. METHODS

At CSC several experiments were conducted, both on well-known pieces and on improvisations. The data were recorded into MIDI files, and then several perceptual and acoustical analyses were carried out in order to extract models.

In the former approach, experiments on known pieces yielded a synthesis model [5] that is able to synthesize a performance conveying an expressive intention by transforming a neutral one (i.e. a literal human performance of the score, without any expressive intention or stylistic choice), both defined with reference to the score. The model uses the results of several perceptual tests and acoustic analyses [15] and acts on the score by adding some of the micro-deviations in the acoustical parameters (tempo, legato and loudness) that a performer usually introduces while playing. This is done by transforming the values already present in the neutral performance by means of two sets of coefficients, named K-coefficients and M-coefficients: a K-coefficient changes the average value of an acoustic quantity (for example, the tempo), and the corresponding M-coefficient is used to scale the deviations of the actual values of the same parameter from the average. In this way, each expressive intention can be represented by a set of 6 parameters (a K and an M for each of the three acoustical parameters).
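The K/M transformation described above can be sketched as follows. This is only a minimal illustration: the exact functional form and the coefficient values used by the CSC model are not given here, so the simple linear form (K rescales the mean, M rescales each deviation from it) and the sample numbers are assumptions for demonstration purposes.

```python
def km_transform(values, k, m):
    """Assumed K/M expressive transformation of one acoustic parameter
    (e.g. a sequence of local tempo values from a neutral performance):
    K rescales the average, M rescales each value's deviation from it."""
    mean = sum(values) / len(values)
    return [k * mean + m * (v - mean) for v in values]

# Hypothetical tempo values (bpm) of a neutral performance.
neutral_tempo = [100.0, 104.0, 98.0, 102.0]

# Hypothetical coefficients for some expressive intention: slightly
# faster on average (K > 1) with exaggerated deviations (M > 1).
expressive_tempo = km_transform(neutral_tempo, k=1.1, m=1.5)
```

Applying one such (K, M) pair to each of tempo, legato and loudness gives the 6-parameter representation of an expressive intention mentioned in the text.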