On Using Prefiltration in HMM-Based Bird Species Recognition

Robert Wielgat, Department of Technology, Higher State Vocational School in Tarnów, Mickiewicza 8, 33-100 Tarnów, Poland, e-mail: rwielgat@poczta.onet.pl
Tomasz Potempa, Department of Technology, Higher State Vocational School in Tarnów, Mickiewicza 8, 33-100 Tarnów, Poland, e-mail: t_potempa@pwsztar.edu.pl
Paweł Świętojański, Department of Technology, Higher State Vocational School in Tarnów, Mickiewicza 8, 33-100 Tarnów, Poland, e-mail: p.swietojanski@gmail.com
Daniel Król, Department of Technology, Higher State Vocational School in Tarnów, Mickiewicza 8, 33-100 Tarnów, Poland, e-mail: danielkrol@poczta.onet.pl

Abstract - An automatic method for recognising bird species from their voices is presented in this paper. The selected bird species are detected by a hidden Markov model (HMM) classifier using Mel-frequency cepstral coefficients (MFCC). To support the recognition process, the analysed signals are filtered before classification in the so-called prefiltration process. The prefiltration strategy uses a bank of n-th order IIR Butterworth filters. Each filter in the bank performs band-pass filtration in a band specific to the bird species and signal type. An increase in recognition accuracy has been observed for prefiltration with a properly chosen filter order. Experiments have been carried out on a set of bird voices covering 30 bird species, one of which is threatened with extinction.

I. INTRODUCTION

Protection of bird species threatened with extinction is one of the most crucial issues of modern environmental protection. Both the problem and its potential solution rest on a simple, yet technically challenging question: “which bird species are threatened with extinction?”.
Practice shows that the answer may require a substantial amount of interdisciplinary resources, from expensive and time-consuming expeditions (which usually require considerable human resources) to tedious manual data analyses. One remedy for this situation could be a database system that automatically collects information about the areas where particular bird species occur, based on automatic recognition of the sounds they produce.

Some research on automatic bird species detection, and on automatic recognition of bird sounds in general, has already been pursued. The first attempts were made by Anderson et al. [1] and Kogan and Margoliash [2], who applied dynamic time warping (DTW) and hidden Markov models (HMM) to classify two bird species, Taeniopygia guttata and Passerina cyanea. A concept of a complete monitoring system is presented in [3], where Kwan et al. used Gaussian mixture models (GMM) for this purpose. They also applied an advanced beamforming technique. Another conceptual approach to a bird species recognition system was presented in [4], where the author examined the usability of support vector machines (SVM) in bird species classification. An example application of artificial neural networks in this field is presented in [5]. The cited papers, as well as the authors' previous experience [6], confirm that a statistical framework based on Gaussian mixture models is a suitable tool for building acoustic models of bird sounds, which is why the research described in this paper takes advantage of it.

MFCC parameters [3-5,8] are commonly used as features in the bird recognition problem. Among other feature extractors, cepstral coefficients (CC) [7] and human-factor cepstral coefficients (HFCC) [8] are also encountered. It should be noted, however, that their optimality in representing bird sounds is still an open research question. In the current investigation MFCC coefficients have been applied.
Besides suitable acoustic features, proper signal pre-processing should also be applied. Because bird voices differ significantly in spectral structure, filtration in a band specific to the bird species seems to be an intuitive choice. Some attempts to analyse the problem can be found in [3,4]. This issue was also investigated by the authors of this paper, and the results showed that removing the higher frequency bands in HFCC features increased recognition accuracy by about 5% [8]. In this paper a more complex filtration methodology using IIR Butterworth filters is presented. The main idea is that filtration is carried out in frequency bands specific to bird species and signal types.

This paper is organised as follows: the second section contains general information about the research methodology and presents some introductory assumptions. The third section gives a detailed description of the dataset and experiments, together with the results. The fourth section contains the discussion, which is directly followed by conclusions.

II. PROBLEM STATEMENT

Signals containing voices of particular bird species recorded in a natural (real) environment are very often corrupted by low- and high-frequency noise, as well as by other bird voices that may interfere with the one being recognised. One method of minimising these effects is to filter the unknown bird voice in all the bands specific to the bird species to be detected. We call this the prefiltering stage, and its general idea in connection with the recognition process is depicted in Fig. 1.

An HMM-based recognition system with prefiltration generally operates in two modes. The first one aims to train the HMMs and consists of the following steps. First, bird voices from the training set are prefiltered in species-specific frequency bands. Afterwards, the signal is converted into a sequence of feature vectors, e.g. MFCC feature vectors.
The parameters of the HMMs are estimated in the traditional way: Viterbi initialisation and the Baum-Welch algorithm [9] are carried out after the feature extraction stage. The training process yields a set of HMMs, one model per bird species (i.e. per frequency band).
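
The role of the filter order in the prefiltration step can be illustrated with a minimal sketch. It evaluates the magnitude response of a Butterworth band-pass filter for one species-specific band, assuming the digital filter is obtained from an analog low-pass prototype by the bilinear transform with pre-warped band edges. The band edges, the sampling rate and all function names are hypothetical and are not taken from the experiments reported here. The order is counted for the low-pass prototype, so the resulting band-pass filter has twice as many poles.

```python
import math


def butter_bandpass_gain(f_hz, low_hz, high_hz, fs_hz, order):
    """Magnitude response |H| of a digital Butterworth band-pass filter.

    Assumption: the filter is designed from an analog low-pass prototype
    of the given order via the bilinear transform with pre-warped edges.
    """
    if not 0.0 < low_hz < high_hz < fs_hz / 2.0:
        raise ValueError("need 0 < low_hz < high_hz < fs_hz / 2")
    if not 0.0 < f_hz < fs_hz / 2.0:
        # DC and Nyquist (and anything outside the valid range)
        # are treated as fully rejected.
        return 0.0

    def warp(f):
        # Bilinear-transform frequency warping (constant factor cancels).
        return math.tan(math.pi * f / fs_hz)

    w, w1, w2 = warp(f_hz), warp(low_hz), warp(high_hz)
    # Low-pass to band-pass frequency mapping; |x| = 1 at both band edges.
    x = (w * w - w1 * w2) / ((w2 - w1) * w)
    try:
        return 1.0 / math.hypot(1.0, abs(x) ** order)
    except OverflowError:
        # Extremely far from the pass band: the gain is numerically zero.
        return 0.0


def gain_db(gain):
    """Convert a linear gain to decibels."""
    return 20.0 * math.log10(gain) if gain > 0.0 else float("-inf")


# Hypothetical species-specific bands in Hz (illustrative values only).
BANDS_HZ = {"species_a": (2000.0, 6000.0), "species_b": (500.0, 3000.0)}
FS_HZ = 44100.0  # assumed sampling rate

low, high = BANDS_HZ["species_a"]
for order in (2, 4, 8):
    edge = gain_db(butter_bandpass_gain(low, low, high, FS_HZ, order))
    out = gain_db(butter_bandpass_gain(1000.0, low, high, FS_HZ, order))
    print(f"order {order}: band edge {edge:.2f} dB, 1 kHz {out:.1f} dB")
```

Under these assumptions the gain at each band edge stays at about -3 dB for every order, while the attenuation of components outside the band grows with the order. This is one reason why the filter order has to be chosen with care: a low order leaves more out-of-band noise in the signal, whereas a high order gives a steeper transition at the band edges.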