Aquat Ecol
https://doi.org/10.1007/s10452-022-09967-5
Out of the shadows: automatic fish detection from acoustic
cameras
R. M. Connolly · K. I. Jinks · A. Shand ·
M. D. Taylor · T. F. Gaston · A. Becker ·
E. L. Jinks
Received: 28 November 2021 / Accepted: 4 May 2022
© The Author(s) 2022
Abstract Efficacious monitoring of fish stocks is
critical for efficient management. Multibeam acoustic
cameras, which use sound reflectance to generate
moving pictures, provide an important alternative to
traditional video-based methods that are inoperable
in turbid waters. However, acoustic cameras, like
standard video monitoring methods, produce large
volumes of imagery from which it is time-consuming
and costly to extract data manually. Deep learning,
a form of machine learning, can be used to automate
the processing and analysis of acoustic data. We
used convolutional neural networks (CNNs) to detect
and count fish in a publicly available dual-frequency
identification sonar (DIDSON) dataset. We compared
three types of detections: direct acoustic, acoustic
shadows, and a combination of direct and shadows.
The deep learning model was highly reliable at
detecting fish to obtain abundance data using acoustic
data. Model accuracy for counts-per-image was
improved by the inclusion of shadows (F1 scores, a
measure of the model accuracy: direct 0.79, shadow
0.88, combined 0.90). Model accuracy for MaxN per
video was high for all three types of detections (F1
scores: direct 0.90, shadow 0.90, combined 0.91). Our
results demonstrate that CNNs are a powerful tool for
automating underwater acoustic data analysis. Given
this promise, we suggest broadening the scope of
testing to include a wider range of fish shapes, sizes, and
abundances, with a view to automating species (or
‘morphospecies’) identification and counts.
Keywords Acoustic camera · Deep learning ·
DIDSON · Estuary · Fish · Sonar
Introduction
Monitoring of fish stocks across a wide range of
environments is a critical task for effective management.
Fisheries scientists and managers monitor fish stocks
by collecting data on population abundance, biomass
and densities (Egerton et al. 2018; Smith et al.
2021), schooling behaviours (Trenkel et al. 2011),
predator–prey relationships (Becker and Suthers
Handling Editor: Sébastien Villeger.
Supplementary Information The online version
contains supplementary material available at
https://doi.org/10.1007/s10452-022-09967-5.
R. M. Connolly (*) · K. I. Jinks · A. Shand · E. L. Jinks
Coastal and Marine Research Centre, School
of Environment and Science, Australian Rivers Institute,
Griffith University, Gold Coast, QLD 4222, Australia
e-mail: r.connolly@griffith.edu.au
M. D. Taylor · A. Becker
Port Stephens Fisheries Institute, New South Wales
Department of Primary Industries, Taylors Beach,
NSW 2315, Australia
T. F. Gaston
School of Environment and Life Sciences, University
of Newcastle, Ourimbah, NSW 2258, Australia