12th International Society for Music Information Retrieval Conference (ISMIR 2011)
FEATURE EXTRACTION AND MACHINE LEARNING
ON SYMBOLIC MUSIC USING THE music21 TOOLKIT
Michael Scott Cuthbert
Music and Theater Arts, M.I.T.
cuthbert@mit.edu

Christopher Ariza
Music and Theater Arts, M.I.T.
ariza@mit.edu

Lisa Friedland
Department of Computer Science, University of Massachusetts Amherst
lfriedl@cs.umass.edu
ABSTRACT
Machine learning and artificial intelligence have great potential to help researchers understand and classify musical scores and other symbolic musical data, but the difficulty of preparing and extracting characteristics (features) from symbolic scores has hindered musicologists (and others who examine scores closely) from using these techniques. This paper describes the "feature" capabilities of music21, a general-purpose, open source toolkit for analyzing, searching, and transforming symbolic music data. The features module of music21 integrates standard feature-extraction tools provided by other toolkits, includes new tools, and also allows researchers to write new and powerful extraction methods quickly. These developments take advantage of the system's built-in capacities to parse diverse data formats and to manipulate complex scores (e.g., by reducing them to a series of chords, determining key or metrical strength automatically, or integrating audio data). This paper's demonstrations combine music21 with the data mining toolkits Orange and Weka to distinguish works by Monteverdi from works by Bach, and German folk music from Chinese folk music.
1. INTRODUCTION
As machine learning and data mining tools become ubiquitous and simple to implement, their potential to classify data automatically, and to point out anomalies in that data, is extending to new disciplines. Most machine learning algorithms run on data that can be represented as numbers. While many types of datasets naturally lend themselves to numerical representations, much of the richness of music (especially music expressed in symbolic forms such as scores) resists easy conversion to the numerical forms that enable classification and clustering tasks.
The amount of preprocessing needed to extract the most musically relevant data from notation encoded in Finale or Sibelius files, or even MIDI files, is often underestimated: musicologists are rarely content to work only with pitch classes and relative note lengths (to name two easily extracted and manipulated types of information). They also want to know where a pitch fits within the currently implied key, whether a note is metrically strong or weak, what text is being sung at the same time, whether chords are in open or closed position, and so on. Such processing and analysis steps need to run rapidly to handle the large repertories now available. A robust system for data mining needs to integrate reliable and well-developed classification tools with a wide variety of methods for extracting data from large collections of scores in a variety of encodings.
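As a toy illustration of one such preprocessing step (locating a pitch within the currently implied key), the sketch below computes scale degrees for pitch classes in a major key. This is a minimal standalone sketch, not music21's own implementation; the function name and numeric conventions are assumptions made here for clarity.

```python
# Toy sketch (not music21 itself): finding where a pitch fits within an
# implied key. Pitch classes are integers 0-11 (C = 0); scale degrees are
# semitone offsets above the tonic of a major key.

MAJOR_SCALE = [0, 2, 4, 5, 7, 9, 11]  # offsets of degrees 1-7

def scale_degree(pitch_class, tonic_pc):
    """Return the 1-based scale degree of pitch_class in the major key on
    tonic_pc, or None if the pitch is chromatic (outside the key)."""
    offset = (pitch_class - tonic_pc) % 12
    if offset in MAJOR_SCALE:
        return MAJOR_SCALE.index(offset) + 1
    return None

# F-sharp (pc 6) in G major (tonic pc 7) is the leading tone, degree 7:
print(scale_degree(6, 7))   # -> 7
# F-natural (pc 5) is chromatic in G major:
print(scale_degree(5, 7))   # -> None
```

A real system must additionally infer the key itself from context, handle minor and modal collections, and distinguish enharmonic spellings, which is exactly the kind of work music21 performs internally.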
The features module, newly added to the Python-based, open source toolkit music21, provides this needed bridge between the demands of music scholars and those of computer researchers. Music21 [3] already has a well-developed and expandable framework for importing scores and other data from the most common symbolic music formats, such as MusicXML [4] (which Finale, Sibelius, MuseScore, and other notation software can produce), Kern/Humdrum [6], CCARH's MuseData [11], Noteworthy Composer, the common folk-music format ABC [10], and MIDI. Scores can easily be transformed from symbolic to sounding representations (by uniting tied notes or moving transposing instruments to C, for instance); simultaneities can be reduced to chords that represent the pitches sounding at any moment; and the key or metrical accents of a passage can be analyzed (even for passages that change key without a change in key signature).
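The reduction of simultaneities to chords can be sketched in a few lines. The sketch below is a simplified standalone illustration of the idea, not music21's implementation: notes are hypothetical (onset, duration, pitch) triples in beat units, and the function name is invented here.

```python
# Minimal sketch of reducing a polyphonic passage to the pitches sounding
# at each moment, in the spirit of the simultaneity reduction described
# above. Each note is an (onset, duration, pitch) triple.

def simultaneities(notes):
    """Return a list of (onset, sounding_pitches) pairs, one per distinct
    onset, where sounding_pitches is the sorted set of pitches whose
    time span covers that onset."""
    onsets = sorted({on for on, dur, p in notes})
    result = []
    for t in onsets:
        sounding = sorted({p for on, dur, p in notes if on <= t < on + dur})
        result.append((t, sounding))
    return result

voices = [
    (0.0, 2.0, 60),  # C4 held for two beats
    (0.0, 1.0, 64),  # E4
    (1.0, 1.0, 65),  # F4 enters on beat two
]
print(simultaneities(voices))
# -> [(0.0, [60, 64]), (1.0, [60, 65])]
```

Note how the held C4 appears in both simultaneities; collecting pitches by sounding span rather than by shared onset is what makes this a "sounding" rather than a purely notational representation.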
The features module expands music21's data mining abilities by adding a battery of commonly used numeric features, such as numerical representations of elements present or absent in a piece (0s or 1s, used, for example, to indicate the presence of a change in time signature), or continuous values representing prevalence (for example, the percentage of all chords in a piece that are triadic). Collections of these features can be used to train machine learning software to classify works by composer, genre, or dance type. Or, making use of notational elements found in certain input formats, they could classify works by graphical characteristics of particular interest to musicologists study-
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page.
© 2011 International Society for Music Information Retrieval
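The two feature types described above (binary presence/absence indicators and continuous prevalence values) can be illustrated with a toy sketch. The dictionary, function names, and labels below are hypothetical placeholders invented for this illustration, not music21's API.

```python
# Toy sketch of the two feature types described above: a 0/1 feature for
# "does the time signature change?" and a continuous feature for "what
# fraction of chords are triads?". The `piece` dict is a stand-in for a
# parsed score, not a real music21 Score object.

def has_meter_change(time_signatures):
    """Binary feature: 1 if more than one distinct time signature appears."""
    return 1 if len(set(time_signatures)) > 1 else 0

def triad_fraction(chord_qualities):
    """Continuous feature: fraction of chords labeled as triads."""
    if not chord_qualities:
        return 0.0
    return sum(1 for q in chord_qualities if q == "triad") / len(chord_qualities)

piece = {
    "time_signatures": ["4/4", "3/4"],
    "chords": ["triad", "triad", "seventh", "triad"],
}
vector = [has_meter_change(piece["time_signatures"]),
          triad_fraction(piece["chords"])]
print(vector)  # -> [1, 0.75]
```

Vectors like this one, computed for every piece in a corpus, are what a classifier in Orange or Weka would consume when learning to separate, say, Monteverdi from Bach.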