REVISTA DO DETUA, VOL. 3, Nº 3, JANEIRO 2001

Abstract - This work is concerned with pitch* determination in monophonic vocal music signals. The proposed Pitch Detection Algorithm (PDA) is based on the autocorrelation function, one of the most widely explored methods of fundamental frequency detection. However, a new approach to the estimation process is developed. This strategy consists of introducing a logic processing interaction unit that enhances the co-operation between the central extractor and the postprocessor (two of the three blocks that characterise most PDAs), in order to avoid erroneous pitch estimates.

I. INTRODUCTION

The problem of estimating the fundamental frequency of speech signals occupies a key position in signal processing research. It has many potential applications in areas such as the transmission, synthesis and recognition of speech, and it plays a leading role in systems that help to correct speech impediments [1]. Pitch determination of speech signals is not a simple task. The difficulty arises from the non-stationary nature of the speech waveform.
This paper, however, focuses on the application of pitch estimation to vocal music processing, an area that has been explored relatively little in comparison with the extensive research carried out by the speech community. Nevertheless, an automatic pitch detector capable of extracting the fundamental frequency from singing voices would have many interesting applications, such as systems for computer-assisted singing teaching or ear training, automatic score transcription, analysis of microtonal non-Western music, and real-time control of MIDI devices [2].

The main differences between vocal music signals and speech signals are the wider range of fundamental frequency (from ≈82.4 Hz to ≈987.7 Hz) [6] and the enormous variations in timbre (and therefore in spectral content) that a singer can produce in a single piece of music. These aspects must be taken into account in order to develop a reliable method of extracting the fundamental frequency of vocal music signals.

In this work we present a Pitch Detection Algorithm based on the autocorrelation function [3]. Some modifications are introduced in the PDA structure in order to increase its performance, reliability and accuracy. The software implementation of this PDA was developed in MATLAB.

We begin with a brief description of the structure of general PDAs and the problems associated with this kind of approach. Then the basic characteristics of the pitch estimation method used by this PDA, the autocorrelation function, are described. Afterwards, we describe the implementation of the proposed PDA. Finally, we present and analyse the results obtained with synthesized, sampled and real signals.

* Although there is a psychoacoustical distinction between "pitch" as a perceived quantity and "fundamental frequency" as a physical quantity, in this paper the two terms are used interchangeably to refer to the fundamental frequency of the voice, measured in Hz.

II. PITCH DETECTION ALGORITHMS

Most PDAs are characterised by the following blocks: the pre-processor, the central extractor and the postprocessor [1]. The central extractor performs the main task: it converts the input signal into a series of pitch estimates. The task of the pre-processor is data reduction and enhancement, in order to facilitate the operation of the central extractor. The postprocessor operates in a more application-oriented way. Some of its typical tasks are error correction, smoothing of the pitch contour and refinement of the pitch estimates.

The main problems with this structure arise in more complex situations, such as voicing transitions. In our view, this is due to the lack of effective interaction between the central extractor and the postprocessor, since in this model pitch estimation and its correction are performed separately.

Pitch Detection in Vocal Music Monophonic Signals
António Sá Pinto, Ana Maria Tomé
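
To make the three-block structure of Section II concrete, the following is a minimal sketch in Python of a generic autocorrelation-based PDA. It is not the authors' MATLAB implementation and does not include the interaction unit proposed in this paper. The function names, the frame length, the hop size, the voicing threshold and the median window width are all assumptions chosen for illustration; only the search range (≈82.4 Hz to ≈987.7 Hz) comes from the text.

```python
import math

F_MIN, F_MAX = 82.4, 987.7  # singing-voice F0 range quoted in the text (Hz)


def preprocess(frame):
    """Pre-processor (assumed form): remove the DC offset of the frame."""
    mean = sum(frame) / len(frame)
    return [x - mean for x in frame]


def extract(frame, fs, voicing_threshold=0.3):
    """Central extractor: F0 from the highest autocorrelation peak.

    Returns 0.0 when the frame is silent or the peak is too weak
    (the threshold value is an illustrative assumption).
    """
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return 0.0
    # Lags that correspond to fundamental frequencies inside [F_MIN, F_MAX].
    lag_min = math.ceil(fs / F_MAX)
    lag_max = min(math.floor(fs / F_MIN), len(frame) - 1)
    best_lag, best_r = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Short-time autocorrelation of the frame at this lag.
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if r > best_r:
            best_lag, best_r = lag, r
    if best_lag == 0 or best_r / energy < voicing_threshold:
        return 0.0
    return fs / best_lag


def postprocess(track, width=3):
    """Postprocessor (assumed form): median smoothing of the pitch contour."""
    half = width // 2
    smoothed = []
    for i in range(len(track)):
        window = sorted(track[max(0, i - half): i + half + 1])
        smoothed.append(window[len(window) // 2])
    return smoothed


def detect_pitch(signal, fs, frame_len=400, hop=200):
    """Run the three blocks in sequence, one estimate per frame."""
    raw = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = preprocess(signal[start:start + frame_len])
        raw.append(extract(frame, fs))
    return postprocess(raw)


# Usage: a synthetic 200 Hz tone sampled at 8 kHz (period of exactly 40 samples).
fs = 8000
tone = [math.sin(2 * math.pi * 200 * n / fs) for n in range(2000)]
track = detect_pitch(tone, fs)
```

In this sketch the postprocessor sees only the finished sequence of estimates and cannot ask the extractor to reconsider a frame. This is the separation between estimation and correction that Section II identifies as the weakness of the conventional structure.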