Aquat Ecol
https://doi.org/10.1007/s10452-022-09967-5

Out of the shadows: automatic fish detection from acoustic cameras

R. M. Connolly · K. I. Jinks · A. Shand · M. D. Taylor · T. F. Gaston · A. Becker · E. L. Jinks

Received: 28 November 2021 / Accepted: 4 May 2022
© The Author(s) 2022

Handling Editor: Sébastien Villeger.

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s10452-022-09967-5.

R. M. Connolly (*) · K. I. Jinks · A. Shand · E. L. Jinks
Coastal and Marine Research Centre, School of Environment and Science, Australian Rivers Institute, Griffith University, Gold Coast, QLD 4222, Australia
e-mail: r.connolly@griffith.edu.au

M. D. Taylor · A. Becker
Port Stephens Fisheries Institute, New South Wales Department of Primary Industries, Taylors Beach, NSW 2315, Australia

T. F. Gaston
School of Environment and Life Sciences, University of Newcastle, Ourimbah, NSW 2258, Australia

Abstract Efficacious monitoring of fish stocks is critical for efficient management. Multibeam acoustic cameras, which use sound reflectance to generate moving pictures, provide an important alternative to traditional video-based methods that are inoperable in turbid waters. However, acoustic cameras, like standard video monitoring methods, produce large volumes of imagery from which it is time consuming and costly to extract data manually. Deep learning, a form of machine learning, can be used to automate the processing and analysis of acoustic data. We used convolutional neural networks (CNNs) to detect and count fish in a publicly available dual-frequency identification sonar (DIDSON) dataset. We compared three types of detections: direct acoustic, acoustic shadows, and a combination of direct and shadows. The deep learning model was highly reliable at detecting fish to obtain abundance data using acoustic data. Model accuracy for counts-per-image was improved by the inclusion of shadows (F1 scores, a measure of the model accuracy: direct 0.79, shadow 0.88, combined 0.90). Model accuracy for MaxN per video was high for all three types of detections (F1 scores: direct 0.90, shadow 0.90, combined 0.91). Our results demonstrate that CNNs are a powerful tool for automating underwater acoustic data analysis. Given this promise, we suggest broadening the scope of testing to include a wider range of fish shapes, sizes, and abundances, with a view to automating species (or ‘morphospecies’) identification and counts.

Keywords Acoustic camera · Deep learning · DIDSON · Estuary · Fish · Sonar

Introduction

Monitoring of fish stocks across a wide range of environments is a critical task for effective management. Fisheries scientists and managers monitor fish stocks by collecting data on population abundance, biomass and densities (Egerton et al. 2018; Smith et al. 2021), schooling behaviours (Trenkel et al. 2011), predator–prey relationships (Becker and Suthers