12th International Society for Music Information Retrieval Conference (ISMIR 2011)

FEATURE EXTRACTION AND MACHINE LEARNING ON SYMBOLIC MUSIC USING THE music21 TOOLKIT

Michael Scott Cuthbert
Music and Theater Arts, M.I.T.
cuthbert@mit.edu

Christopher Ariza
Music and Theater Arts, M.I.T.
ariza@mit.edu

Lisa Friedland
Department of Computer Science, University of Massachusetts Amherst
lfriedl@cs.umass.edu

ABSTRACT

Machine learning and artificial intelligence have great potential to help researchers understand and classify musical scores and other symbolic musical data, but the difficulty of preparing and extracting characteristics (features) from symbolic scores has hindered musicologists (and others who examine scores closely) from using these techniques. This paper describes the "feature" capabilities of music21, a general-purpose, open source toolkit for analyzing, searching, and transforming symbolic music data. The features module of music21 integrates standard feature-extraction tools provided by other toolkits, includes new tools, and also allows researchers to write new and powerful extraction methods quickly. These developments take advantage of the system's built-in capacities to parse diverse data formats and to manipulate complex scores (e.g., by reducing them to a series of chords, determining key or metrical strength automatically, or integrating audio data). This paper's demonstrations combine music21 with the data mining toolkits Orange and Weka to distinguish works by Monteverdi from works by Bach and German folk music from Chinese folk music.

1. INTRODUCTION

As machine learning and data mining tools become ubiquitous and simple to implement, their potential to classify data automatically, and to point out anomalies in that data, is extending to new disciplines. Most machine learning algorithms run on data that can be represented as numbers.
While many types of datasets naturally lend themselves to numerical representations, much of the richness of music (especially music expressed in symbolic forms such as scores) resists easy conversion to the numerical forms that enable classification and clustering tasks. The amount of preprocessing needed to extract the most musically relevant data from notation encoded in Finale or Sibelius files, or even MIDI files, is often underestimated: musicologists are rarely content to work only with pitch classes and relative note lengths—to name two easily extracted and manipulated types of information. They also want to know where a pitch fits within the currently implied key, whether a note is metrically strong or weak, what text is being sung at the same time, whether chords are in open or closed position, and so on. Such processing and analysis steps need to run rapidly to handle the large repertories now available. A robust system for data mining needs to integrate reliable and well-developed classification tools with a wide variety of methods for extracting data from large collections of scores in a variety of encodings.

The features module, newly added to the Python-based, open source toolkit music21, provides this needed bridge between the demands of music scholars and of computer researchers. Music21 [3] already has a well-developed and expandable framework for importing scores and other data from the most common symbolic music formats, such as MusicXML [4] (which Finale, Sibelius, MuseScore, and other notation software can produce), Kern/Humdrum [6], CCARH's MuseData [11], Noteworthy Composer, the common folk-music format ABC [10], and MIDI.
Scores can easily be transformed from symbolic to sounding representations (by uniting tied notes or moving transposing instruments to C, for instance); simultaneities can be reduced to chords that represent the pitches sounding at any moment; and the key or metrical accents of a passage can be analyzed (even for passages that change key without a change in key signature).

The features module expands music21's data mining abilities by adding a battery of commonly used numeric features, such as numerical representations of elements present or absent in a piece (0s or 1s, used, for example, to indicate the presence of a change in a time signature), or continuous values representing prevalence (for example, the percentage of all chords in a piece that are triadic). Collections of these features can be used to train machine learning software to classify works by composer, genre, or dance type. Or, making use of notational elements found in certain input formats, they could classify works by graphical characteristics of particular interest to musicologists studying

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page.
© 2011 International Society for Music Information Retrieval