MIREX 2010 SYMBOLIC MELODIC SIMILARITY: LOCAL ALIGNMENT WITH GEOMETRIC REPRESENTATIONS

Julián Urbano, Juan Lloréns, Jorge Morato and Sonia Sánchez-Cuadrado
University Carlos III of Madrid
Department of Computer Science
jurbano@inf.uc3m.es llorens@inf.uc3m.es jorge@ie.inf.uc3m.es ssanchec@ie.inf.uc3m.es

ABSTRACT

This short paper describes four submissions to the Symbolic Melodic Similarity task of the MIREX 2010 edition. All four submissions rely on a local-alignment approach between sequences of n-grams, and they differ mainly in the substitution score between two n-grams. This score is based on a geometric representation that models musical pieces as curves in the pitch-time plane. One of the systems described ranked first for all ten effectiveness measures used, and the other three ranked from second to fifth, depending on the measure.

1. INTRODUCTION

The problem of Symbolic Melodic Similarity, where a retrieval system is expected to return a ranked list of musical pieces deemed similar to another one (i.e. the query), has been approached from very different points of view [1]. Some techniques are based on geometric representations of music, others rely on classic n-gram representations to calculate similarities, and others use edit distances and alignment algorithms.

In a previous work we mixed these three major approaches [2]. We modeled melodies as sequences of overlapping n-grams of 3 consecutive notes, which were then compared using a modified version of the Smith-Waterman local-alignment algorithm [3]. The substitution score between two n-grams was calculated based on a geometric interpretation of the notes within the n-grams, which considers musical pieces as curves in the pitch-time plane. We have improved this approach and submitted four variations to the current 2010 edition of MIREX: Domain, PitchDeriv, ParamDeriv and Shape.
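
As a rough illustration of the pipeline just outlined, the following Python sketch splits a melody into overlapping n-grams of 3 consecutive notes and aligns two such sequences with the textbook Smith-Waterman recurrence. It is only a sketch under stated assumptions: the function names (overlapping_ngrams, local_alignment_score), the use of MIDI pitch numbers and the placeholder scores in the usage lines are hypothetical, and the recurrence is the standard one, not the modified version used in [2].

```python
from typing import Callable, List, Sequence, Tuple

NGram = Tuple[int, ...]


def overlapping_ngrams(pitches: Sequence[int], n: int = 3) -> List[NGram]:
    """Split a pitch sequence into overlapping n-grams of n consecutive notes."""
    return [tuple(pitches[i:i + n]) for i in range(len(pitches) - n + 1)]


def local_alignment_score(
    a: Sequence[NGram],
    b: Sequence[NGram],
    substitution: Callable[[NGram, NGram], float],
    gap: Callable[[NGram], float],
) -> float:
    """Textbook Smith-Waterman: best score over all pairs of subsequences."""
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0.0] * cols for _ in range(rows)]
    best = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            h[i][j] = max(
                0.0,  # a local alignment may restart anywhere
                h[i - 1][j - 1] + substitution(a[i - 1], b[j - 1]),
                h[i - 1][j] + gap(a[i - 1]),  # a[i-1] left unaligned
                h[i][j - 1] + gap(b[j - 1]),  # b[j-1] left unaligned
            )
            best = max(best, h[i][j])
    return best


# Placeholder scores (assumption): +1 for identical n-grams, -1 otherwise.
def toy_substitution(x: NGram, y: NGram) -> float:
    return 1.0 if x == y else -1.0


def toy_gap(x: NGram) -> float:
    return -1.0


query = overlapping_ngrams([60, 62, 64, 65, 67])
document = overlapping_ngrams([50, 60, 62, 64, 65, 70])
score = local_alignment_score(query, document, toy_substitution, toy_gap)
```

Passing the scoring functions as parameters keeps the alignment loop independent of how n-grams are compared, which is the part that varies between the four submissions.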
In the next section we describe the local-alignment approach we followed, discussing the insertion, deletion and match scores common to all four submissions. Section 3 describes how the substitution score is calculated in each case and Section 4 describes the re-ranking phase. Section 5 discusses the results, and the paper closes with conclusions and discussion.

2. LOCAL-ALIGNMENT

We implemented a heuristic very similar to the classical TF-IDF (Term Frequency-Inverse Document Frequency) weighting in Text Information Retrieval: the more frequent an n-gram is in the document collection, the less important it is for the comparison of two documents. Thus, the insertion, deletion and match scores between two n-grams are adapted as follows:

Insertion: s(-, n) = -(1 - f(n)). Adding an infrequent n-gram is penalized more than adding a frequent one.
Deletion: s(n, -) = -(1 - f(n)). Missing an infrequent n-gram is penalized more than missing a frequent one.
Match: s(n, n) = 1 - f(n). Matching an infrequent n-gram is rewarded more than matching a frequent one.

Here f(n) denotes the frequency of the n-gram n in the document collection. The representation schema used for the n-grams at this point is directed-interval.

3. SUBSTITUTION SCORES

The four systems submitted differ in the substitution function s(n, m) used by the local-alignment algorithm. Next, we describe how it is calculated in each case.

3.1 JU1: Domain

The substitution score s(n, m) is calculated as the average of the absolute values of the directed-interval differences between the corresponding notes of the two n-grams. For example, for s((71, 70, 71), (78, 73, 74)) the directed intervals of the two n-grams are (-1, +1) and (-5, +1), so the average of the absolute interval differences is:

(|(-1) - (-5)| + |(+1) - (+1)|) / 2 = (4 + 0) / 2 = 2

This system completely ignores the time dimension of music, but it has the advantage of being transposition invariant.

3.2 JU2: PitchDeriv

In this case, the n-grams are represented as curves in the pitch-time plane.
Each note is arranged in the plane according to its pitch height and its onset time, and then we calculate the interpolating curve passing through the notes (see Figure 1). From that point on, only the curve is used to compare the n-gram to another one.

Figure 1. Melody represented as a curve in the pitch-time plane (axes: Pitch, Time).

This document is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 License. http://creativecommons.org/licenses/by-nc-sa/3.0/ © 2010 The Authors
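
The curve representation of Section 3.2 can be illustrated with a small Python sketch that places the notes of an n-gram at (onset time, pitch) and evaluates a curve passing through all of them. The type of interpolating curve is not specified above, so the Lagrange polynomial used here is purely an assumption chosen for brevity; the name interpolate and the unit-spaced onset times in the usage lines are likewise hypothetical.

```python
from typing import Sequence, Tuple

Note = Tuple[float, float]  # (onset time, pitch)


def interpolate(notes: Sequence[Note], t: float) -> float:
    """Evaluate at time t the Lagrange polynomial through all the notes.

    Assumes that all onset times are distinct.
    """
    total = 0.0
    for i, (t_i, pitch_i) in enumerate(notes):
        weight = 1.0
        for j, (t_j, _) in enumerate(notes):
            if j != i:
                weight *= (t - t_j) / (t_i - t_j)
        total += pitch_i * weight
    return total


# The n-gram (71, 70, 71) with hypothetical onsets at times 0, 1 and 2.
ngram = [(0.0, 71.0), (1.0, 70.0), (2.0, 71.0)]
on_note = interpolate(ngram, 1.0)      # the curve passes through each note
between = interpolate(ngram, 0.5)      # pitch of the curve between two notes
```

Once the curve is available as a function of time, two n-grams can be compared through their curves alone, without referring back to the individual notes.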