IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 19, NO. 1, JANUARY 2011 111

Musical Instrument Classification Using Individual Partials

Jayme Garcia Arnal Barbedo and George Tzanetakis, Member, IEEE

Abstract—In musical signals, the spectral and temporal content of instruments often overlaps. If the number of channels is at least equal to the number of instruments, statistical tools can be applied to highlight the characteristics of each instrument, making their identification possible. However, in the underdetermined case, in which there are fewer channels than sources, the task becomes challenging. One possible way to address this problem is to seek regions in the time and/or frequency domains in which the content of a given instrument appears isolated. The strategy presented in this paper exploits the spectral disjointness among instruments by identifying isolated partials, from which a number of features are extracted. The information contained in those features, in turn, is used to infer which instrument is most likely to have generated that partial. Hence, the only condition for the method to work is that at least one isolated partial exists for each instrument somewhere in the signal. If several isolated partials are available, the results are summarized into a single, more accurate classification. Experimental results using 25 instruments demonstrate the good discrimination capabilities of the method.

Index Terms—Feature extraction, partialwise instrument classification, spectral disjointness, underdetermined mixtures.

I. INTRODUCTION

THE identification of the instruments that compose a musical signal has received increasing attention in recent years. Such interest is fed by the potential benefits that an accurate instrument classifier can bring to other digital audio applications.
In particular, musical genre classification can be greatly improved if the instruments present in a given song are known, since this information can be used to narrow down the set of potential musical genres. Sound source separation algorithms can also exploit such information, particularly if they deal with underdetermined signals. In this case, knowledge about the instruments can be used to create instrument-specific rules that improve the quality of the sound source separation.

Manuscript received September 04, 2009; revised December 09, 2009. Date of publication March 11, 2010; date of current version October 01, 2010. This work was supported by Foreign Affairs and International Trade Canada under a Post-Doctoral Research Fellowship Program (PDRF). The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Dan Ellis.
J. G. A. Barbedo was with the Department of Computer Science, University of Victoria, Victoria, BC V8W 3P6, Canada. He is now with the Department of Communications, FEEC, UNICAMP, C.P. 6101, CEP 13.083-852, Campinas, SP, Brazil (e-mail: jgab@decom.fee.unicamp.br).
G. Tzanetakis is with the Department of Computer Science, University of Victoria, Victoria, BC V8W 3P6, Canada (e-mail: gtzan@cs.uvic.ca).
Digital Object Identifier 10.1109/TASL.2010.2045186

Early work in the area was mainly devoted to the identification of instruments in monophonic signals. This problem is, in general, less challenging than the polyphonic case, since the instrument to be classified is isolated from the interference of any other sound source. Most of those proposals deal with general instruments [1]–[11], while a few others deal with specific cases, such as the classification of woodwinds [12], [13] and the discrimination between piano and guitar [14]. In recent years, a number of strategies capable of dealing with polyphonic musical signals have been proposed. Most of them have some important limitations.
— Limited number of instruments: some of the methods proposed in the literature only work for, and/or were only tested on, a small set of instruments (six or fewer) (e.g., [15]–[22]).
— Low accuracy: in some cases the accuracy is below 50% even when few instruments are considered (e.g., [19], [23]).
— Instrument combinations set a priori: in this case, the strategies try to classify the signals according to predefined combinations of instruments; hence, they fail if the mixture contains a combination of instruments that was not considered in the training (e.g., [24], [25]).
— Polyphony limited to duets: some strategies can only deal with two simultaneous instruments (e.g., [26], [27]).

Thus, despite the clear advances achieved in recent years, many limitations still prevent instrument identification tools from being more widely used. This paper presents a simple and reliable strategy to identify instruments in polyphonic musical signals that overcomes some of the main limitations faced by its predecessors. The identification uses a majority decision based upon pairwise comparisons of instrument likelihoods. A related but more complex approach was used by Essid et al. [5] to classify solo musical phrases. The method presented here is basically a system in which majority rules are successively applied, as briefly described in the following.

In real musical signals, simultaneous sources (instruments and vocals) normally have a high degree of correlation and overlap both in time and frequency, as a result of the underlying rules normally followed by Western music (e.g., notes with integer ratios of pitch intervals). This can make the identification of instruments challenging. However, one can expect to find at least some unaffected partials throughout the signal, which can be exploited to provide cues about the corresponding instrument.
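To make the notion of an "unaffected" partial concrete, the following sketch flags which harmonics of one note escape collision with the harmonics of other concurrently sounding notes. This is an illustrative simplification, not the paper's detection procedure: the function name, the fixed number of partials, and the relative-frequency tolerance are all assumptions introduced here.

```python
# Hypothetical sketch: find partials of a note (fundamental f0) that do
# not collide with any harmonic of the other concurrent fundamentals.
# The 3% relative tolerance loosely models spectral spread; it is an
# illustrative choice, not a value taken from the paper.

def isolated_partials(f0, other_f0s, n_partials=20, tol=0.03):
    """Return 1-based indices of the partials of f0 whose frequencies
    stay at least `tol` (relative) away from every harmonic of every
    other fundamental in `other_f0s`."""
    isolated = []
    for k in range(1, n_partials + 1):
        fk = k * f0  # frequency of the k-th partial
        collides = any(
            abs(fk - m * g) / fk < tol
            for g in other_f0s
            # only harmonics of g near or below fk can collide with it
            for m in range(1, int(fk / g) + 2)
        )
        if not collides:
            isolated.append(k)
    return isolated
```

For example, two notes a perfect fifth apart (440 Hz and 660 Hz, a 3:2 pitch ratio) share every third harmonic of the lower note, so partials 3, 6, 9, ... of the 440 Hz note are lost, while the remaining ones stay usable for classification.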
Based on this observation, the proposed algorithm extracts features individually from each partial that does not collide with any other partial (isolated partials). Each pair of instruments is characterized by a particular set of nine features, selected from a complete set of 34 features. Each partial is assigned to one of the two instruments of the pair using a linear classifier: if the feature value is greater than a given threshold, the partial is attributed to one instrument; otherwise, to the other. A first majority rule is applied by summarizing the results of the nine features; as a result, each pair of instruments

1558-7916/$26.00 © 2010 IEEE
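The pairwise thresholding and majority voting described above can be sketched as follows. All names, feature values, and thresholds here are illustrative placeholders, not the paper's trained parameters or selected features; the sketch only shows the decision structure (per-pair vote over nine thresholded features, then a round-robin vote over all pairs).

```python
# Hypothetical sketch of the two majority stages: each instrument pair
# has a list of (feature_name, threshold, winner_if_above) rules; the
# per-pair vote picks the pair's winner, and a round-robin vote over
# all pairs elects the instrument for the partial.
from collections import Counter

def pair_vote(features, rules, inst_a, inst_b):
    """First majority rule: each thresholded feature casts one vote."""
    votes = Counter()
    for name, thr, above in rules:
        # if the feature exceeds its threshold, `above` wins the vote;
        # otherwise the other instrument of the pair wins it
        winner = above if features[name] > thr else (
            inst_b if above == inst_a else inst_a)
        votes[winner] += 1
    return votes.most_common(1)[0][0]

def classify_partial(features, pairwise_rules):
    """Second majority rule: round-robin over all instrument pairs."""
    wins = Counter()
    for (a, b), rules in pairwise_rules.items():
        wins[pair_vote(features, rules, a, b)] += 1
    return wins.most_common(1)[0][0]
```

In the paper each per-pair rule list has nine entries (selected from 34 candidate features), so a pair's vote can never tie; the shorter lists above are only for brevity.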