IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 19, NO. 1, JANUARY 2011 111
Musical Instrument Classification Using
Individual Partials
Jayme Garcia Arnal Barbedo and George Tzanetakis, Member, IEEE
Abstract—In musical signals, the spectral and temporal contents of instruments often overlap. If the number of channels is at least equal to the number of instruments, it is possible to apply
statistical tools to highlight the characteristics of each instrument,
making their identification possible. However, in the underdeter-
mined case, in which there are fewer channels than sources, the
task becomes challenging. One possible way to solve this problem
is to seek regions in the time and/or frequency domains in
which the content of a given instrument appears isolated. The
strategy presented in this paper exploits the spectral disjointness
among instruments by identifying isolated partials, from which a
number of features are extracted. The information contained in
those features, in turn, is used to infer which instrument is more
likely to have generated that partial. Hence, the only condition for
the method to work is that at least one isolated partial exists for
each instrument somewhere in the signal. If several isolated par-
tials are available, the results are summarized into a single, more
accurate classification. Experimental results using 25 instruments
demonstrate the good discrimination capabilities of the method.
Index Terms—Feature extraction, partialwise instrument classi-
fication, spectral disjointness, underdetermined mixtures.
I. INTRODUCTION
The identification of the instruments that compose a musical signal has received increasing attention in recent years. Such an interest is fed by the potential benefits that an
accurate instrument classifier can bring to other digital audio
applications. In particular, musical genre classification can be
greatly improved if the instruments present in a given song are
known, since this information can be used to narrow down the
set of potential musical genres. Sound source separation algorithms can also exploit such information, particularly if they
deal with underdetermined signals. In this case, the knowledge
about the instruments can be used to create instrument-specific
rules to improve the quality of the sound source separation.
Early work in the area was mainly devoted to the identifi-
cation of instruments in monophonic signals. This problem is,
in general, less challenging than the polyphonic case, since the
Manuscript received September 04, 2009; revised December 09, 2009. Date
of publication March 11, 2010; date of current version October 01, 2010. This
work was supported by Foreign Affairs and International Trade Canada under a
Post-Doctoral Research Fellowship Program (PDRF). The associate editor co-
ordinating the review of this manuscript and approving it for publication was
Dr. Dan Ellis.
J. G. A. Barbedo was with the Department of Computer Science, University of
Victoria, Victoria, BC V8W 3P6, Canada. He is now with the Department of
Communications, FEEC, UNICAMP C.P. 6101, CEP: 13.083-852, Campinas,
SP, Brazil (e-mail: jgab@decom.fee.unicamp.br).
G. Tzanetakis is with the Department of Computer Science, University of
Victoria, Victoria, BC V8W 3P6 , Canada (e-mail: gtzan@cs.uvic.ca).
Digital Object Identifier 10.1109/TASL.2010.2045186
instrument to be classified is isolated from the interference of
any other sound source. Most of those proposals deal with gen-
eral instruments [1]–[11], while a few others deal with specific
cases, like classification of woodwinds [12], [13] and discrimi-
nation between piano and guitar [14].
In recent years, a number of strategies capable of dealing
with polyphonic musical signals have been proposed. Most of
them have some important limitations.
— Limited number of instruments: some of the methods pro-
posed in the literature only work and/or were only tested
for a small (six or fewer) set of instruments (e.g., [15]–[22]).
— Low accuracy: in some cases the accuracy is below 50%
even when only a few instruments are considered (e.g., [19], [23]).
— Instrument combinations set a priori: in this case, the
strategies try to classify the signals according to prede-
fined combinations of instruments; hence, they fail if the
mixture has a combination of instruments that was not
considered in the training (e.g., [24], [25]).
— Polyphony limited to duets: some strategies can only deal
with two simultaneous instruments (e.g., [26], [27]).
Thus, despite the clear advances achieved in recent years, there are still many limitations that prevent instrument identification tools from being more widely used. This paper presents
a simple and reliable strategy to identify instruments in poly-
phonic musical signals that overcomes some of the main limi-
tations faced by its predecessors. The identification uses a ma-
jority decision based upon pairwise comparisons of instrument
likelihoods. A related but more complex approach was used by
Essid et al. [5] to classify solo musical phrases. The method
presented here is basically a system in which majority rules are
successively applied, as briefly described in the following.
In real musical signals, simultaneous sources (instruments
and vocals) normally have a high degree of correlation and
overlap both in time and frequency, as a result of the underlying
rules normally followed by western music (e.g., notes whose pitches are related by simple integer ratios). This can make the identification of instruments challenging. However, one can expect
to find at least some unaffected partials throughout the signal,
which can be exploited to provide cues about the corresponding
instrument. As a result of such an observation, the proposed al-
gorithm extracts features individually for each partial that does
not collide with any other partial (isolated partials). Each pair of
instruments is characterized by a particular set of nine features,
selected from a complete set of 34 features. Each partial is
assigned to one of the pair of instruments using a linear classifier: if the value of a feature exceeds a given threshold, it votes for one instrument; otherwise, it votes for the other. A first majority rule is applied by summarizing the votes of the nine features; as a result, each pair of instruments
1558-7916/$26.00 © 2010 IEEE
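To make the per-pair decision concrete, the following is a minimal sketch of threshold-based voting over the features of one isolated partial, followed by the first majority rule. The function name, the feature values, the uniform thresholds, and the polarity convention are all hypothetical illustrations, not taken from the paper, which selects a specific set of nine features per instrument pair.

```python
def classify_partial_pairwise(features, thresholds, polarities):
    """Decide between instruments 'A' and 'B' for one isolated partial.

    features, thresholds, polarities: equal-length sequences (nine
    entries in the paper's setup). polarities[i] is +1 if feature i
    exceeding its threshold indicates instrument A, and -1 if it
    indicates instrument B. Returns the majority vote.
    """
    votes_for_a = 0
    for value, threshold, polarity in zip(features, thresholds, polarities):
        above = value > threshold
        # A feature votes for A when its value falls on A's side
        # of the threshold, as defined by the polarity.
        if (above and polarity > 0) or (not above and polarity < 0):
            votes_for_a += 1
    # With an odd number of features (nine), ties cannot occur.
    return 'A' if votes_for_a > len(features) / 2 else 'B'

# Hypothetical feature values for one partial, uniform 0.5 thresholds,
# and all polarities pointing to A when above the threshold.
feats = [0.2, 0.8, 0.5, 0.9, 0.1, 0.7, 0.3, 0.6, 0.4]
decision = classify_partial_pairwise(feats, [0.5] * 9, [1] * 9)
print(decision)  # 4 of 9 features vote for A, so the majority picks 'B'
```

In the actual system each instrument pair would use its own nine selected features and trained thresholds, and the per-partial decisions would then be summarized across all isolated partials into the final classification.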