Ecological Informatics 71 (2022) 101766
Available online 8 August 2022
1574-9541/© 2022 Elsevier B.V. All rights reserved.

Detection of baleen whale species using kernel dynamic mode decomposition-based feature extraction with a hidden Markov model

A.M. Usman *, D.J.J. Versfeld
Department of Electrical and Electronic Engineering, Stellenbosch University, Stellenbosch, South Africa

ARTICLE INFO

Keywords: Baleen whales; Detection; DMD; Eigendecomposition; Error rate; Feature extraction; HMM; Kernel DMD; Precision; True positive rate

ABSTRACT

The negative effects of human activities within the ecological space of whales remain an issue of concern to marine ecologists. The accurate detection and subsequent classification of whale species are vital in mitigating these negative effects. Automatic detection techniques have come in handy for the efficient detection of the various whale species without human error. The hidden Markov model (HMM) remains one of the most efficient detectors of whale species. However, its performance is greatly influenced by the feature vectors used with it. In this work, we propose the use of the kernel dynamic mode decomposition (kDMD) algorithm as a tool to extract features of baleen whale species, which are then used with an HMM for their detection. Dynamic mode decomposition (DMD) is an eigendecomposition-based algorithm that is capable of extracting latent underlying features of non-linear signals such as those vocalised by whales. However, the underlying cost of DMD is the singular value decomposition (SVD), which adds significant complexity to the mode-derivation steps. Thus, this work introduces the kernel method into DMD in order to compute it more efficiently, without explicitly using the SVD algorithm. Furthermore, the feature-formation steps of the original DMD were modified (mDMD) in this work to make it more generic for datasets with sparse whale sound samples.
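To make the role of the SVD in standard DMD concrete, the following is a minimal sketch in Python with NumPy. It shows the textbook "exact DMD" procedure on a snapshot matrix, not the authors' feature-extraction pipeline and not the kernel variant. The function name `dmd`, the `rank` parameter, and the toy two-state linear system are all illustrative assumptions.

```python
import numpy as np


def dmd(snapshots, rank):
    """Textbook exact DMD (illustrative sketch, not the paper's pipeline).

    snapshots: 2-D array whose columns are successive states x_0, x_1, ...
    rank:      number of singular values kept in the truncated SVD.
    Returns the DMD eigenvalues and the corresponding DMD modes.
    """
    X, Y = snapshots[:, :-1], snapshots[:, 1:]   # pairs (x_k, x_{k+1})

    # The SVD of X is the costly step that the abstract refers to.
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]

    # Y V S^{-1}: dividing by s scales column j by 1 / s[j].
    YVS = Y @ Vh.conj().T / s

    # Reduced operator: projection of the best-fit linear map onto U.
    A_tilde = U.conj().T @ YVS

    # Eigendecomposition of the reduced operator gives eigenvalues and modes.
    eigvals, W = np.linalg.eig(A_tilde)
    modes = YVS @ W
    return eigvals, modes


# Toy data (assumption): a 2-state linear system with eigenvalues 0.9 and 0.5.
A = np.array([[0.9, 0.2],
              [0.0, 0.5]])
states = [np.array([1.0, 1.0])]
for _ in range(9):
    states.append(A @ states[-1])
snapshots = np.column_stack(states)

eigvals, modes = dmd(snapshots, rank=2)
```

On this toy system the recovered eigenvalues should match those of `A`, which is a quick sanity check before applying the same steps to real data. For a one-dimensional recording, the snapshot matrix would first have to be built from the signal, for example by stacking time-delayed frames; that step is outside this sketch.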
The performance of the detectors was tested on datasets containing sounds of southern right whales (SRWs) and humpback whales. The results obtained show a high true positive rate (TPR), high precision (PREC) and low error rate (ERR) for both species. The performance of the three DMD-based feature-extraction methods was compared. The kDMD-HMM detector generally performed better than the mDMD-HMM and DMD-HMM detectors. The methods proposed here can be tailored for the automatic detection and classification of other vocalising animal species through their sounds.

1. Introduction

Whale species, a suborder of the cetacean taxonomy, are among the marine mammals that are facing threats to their existence within their ecological space. These threats result from the effects of human activities such as shipping, marine exploration, geographical seismic surveys, commercial whaling, and naval sonar actions, as well as climate change effects (Usman et al., 2020). Whale species are of concern to the general public and marine ecology managers because of their significance to the economics of the tourism sector, as well as the increasing understanding of their value in maintaining healthy aquatic ecosystems (Jefferson et al., 2011). Hence, they have continued to gain the attention of researchers, who have been proposing various solutions to mitigate these threats. These solutions, which are based on ecological informatics studies of the species, include reliable estimations of their population density (Marques et al., 2013), measurements of range and seasonal occurrence, and determinations of their population structures (Zimmer, 2011). The accurate detection of whale species and their subsequent classification are central to helping marine ecologists propose the solutions highlighted above, and also to providing a better understanding of their ecology. Passive acoustic monitoring (PAM) is one of the sources of ecological information on whale species.
* Corresponding author. E-mail address: ayinde.mohammed@yahoo.co.uk (A.M. Usman).
Journal homepage: www.elsevier.com/locate/ecolinf
https://doi.org/10.1016/j.ecoinf.2022.101766
Received 19 May 2022; Received in revised form 2 August 2022; Accepted 3 August 2022

PAM has been proven to be an effective way to observe whales whilst remaining unobtrusive, and has hence become an important method for data gathering (Usman et al., 2020). The detection and classification can be done manually by simple observation of spectrograms of the recorded whale sounds, or by expert marine ecologists listening to these sounds (Putland et al., 2018). However, large volumes of data are often gathered during the PAM process, which can run for weeks, months or even years. Thus, manual analysis of the data is difficult and prone to human error. As a result, different automatic