Bird-Species Audio Identification, Ensembling 1D + 2D Signals

Gyanendra Das 1, Saksham Aggarwal 1
1 Indian Institute of Technology, Dhanbad, India

Abstract
In this paper, we describe a method for recognizing bird species in audio recordings. We experimented with 4 different approaches. Our spectrogram-domain and waveform-domain pipelines each consist of two main models: 1) a binary classifier that predicts whether a bird call is present in the audio; 2) a multiclass classifier that predicts which bird is present. Combining these two approaches, 1D and 2D signals, gives strong results. We also experiment with ATDemucs, which extends Demucs by replacing the BiLSTM with self-attention. In this approach, we first perform source separation of multiple birds together with noise separation, as Universal Source Separation. We then classify each source using both a 1D waveform model, ReSE-Multi with self-attention, and a 2D spectrogram model. We also discuss how we handle different thresholds for different models with a post-processing technique. Ensembling techniques such as Voting, Scaling and Direct Averaging gave a good boost to our results. Our combined architecture, including 1D and 2D signals, achieves a micro-averaged F1 of 0.6179 on the task, which asked for classification of 397 bird species.

Keywords
Deep Learning, Bird Species Classification, Transfer Learning, Attention Mechanism, Sound Detection, Audio Source Detection, Demucs, ResNet50, EfficientNet, Ensembling, Multi Domain Meta Training

1. Introduction

There are about 10,000 different bird species in the world, and they all play an important role in nature. They serve as good indicators of declining habitat quality and pollution. It is often easier to hear birds than it is to see them.
BirdCLEF 2021[1] - Birdcall Identification is a Kaggle competition organized by the Cornell Lab of Ornithology in collaboration with LifeCLEF 2021[1]. The challenge is to identify which birds are calling in long recordings, given training data generated in meaningfully different contexts. This paper first gives details of the competition and the provided data, so that the challenges posed by the train and test data are clear. We then describe in detail the approaches we used for this challenge, including data preparation, augmentations, model building, training procedure, and post-processing techniques.

CLEF 2021 – Conference and Labs of the Evaluation Forum, September 21–24, 2021, Bucharest, Romania
gyanendralucky9337@gmail.com (G. Das); sakshamaggarwal20@gmail.com (S. Aggarwal)
https://luckygyana.github.io/Portfolio/ (G. Das); https://github.com/saksham20aggarwal (S. Aggarwal)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073