Chinese White Dolphin Detection in the Wild
Hao Zhang
zhanghaoinf@gmail.com
Department of Computer Science,
City University of Hong Kong
Hong Kong SAR, China
Qi Zhang
qzhang364-c@my.cityu.edu.hk
Department of Computer Science,
City University of Hong Kong
Hong Kong SAR, China
Phuong Anh Nguyen
panguyen2@cityu.edu.hk
Department of Computer Science,
City University of Hong Kong
Hong Kong SAR, China
Victor Lee
csvlee@eee.hku.hk
Department of Electrical and
Electronic Engineering, The
University of Hong Kong
Hong Kong SAR, China
Antoni B. Chan
abchan@cityu.edu.hk
Department of Computer Science,
City University of Hong Kong
Hong Kong SAR, China
ABSTRACT
For ecological protection of the ocean, biologists usually conduct
line-transect vessel surveys to measure the population density of
marine species (such as dolphins) within their habitat. However,
observing marine species via vessel surveys is labor-intensive and
more challenging than observing common objects, due to the scarcity
of the objects in the wild, their tiny size, and similar-sized
distracter objects (e.g., floating trash). To reduce the human
experts’ workload and improve the observation accuracy, in this
paper we develop a practical system to detect Chinese White Dolphins
in the wild automatically. First, we construct a dataset named
Dolphin-14k with more than 2.6k dolphin instances. To overcome the
low annotation efficiency caused by the rarity of dolphins, we
design an interactive dolphin box annotation strategy to annotate
sparse dolphin instances in long videos efficiently. Second, we
compare the performance and efficiency of three off-the-shelf object
detection algorithms, Faster-RCNN, FCOS, and YoloV5, on the
Dolphin-14k dataset and pick YoloV5 as the detector, adding a new
category (Distracter) to the model training to reject false
positives. Finally, we incorporate the dolphin detector into a system
prototype, which detects dolphins in video frames at 100.99 FPS per
GPU with high accuracy (i.e., 90.95 mAP@0.5).
CCS CONCEPTS
• Computing methodologies → Computer vision tasks; Scene
understanding; Vision for robotics; Neural networks.
KEYWORDS
datasets, neural networks, dolphin detection, detection system
MMAsia ’21, December 1–3, 2021, Gold Coast, Australia
© 2021 Association for Computing Machinery.
ACM ISBN 978-1-4503-8607-4/21/12. . . $15.00
https://doi.org/10.1145/3469877.3490574
ACM Reference Format:
Hao Zhang, Qi Zhang, Phuong Anh Nguyen, Victor Lee, and Antoni B. Chan.
2021. Chinese White Dolphin Detection in the Wild. In ACM Multimedia
Asia (MMAsia ’21), December 1–3, 2021, Gold Coast, Australia. ACM, New
York, NY, USA, 5 pages. https://doi.org/10.1145/3469877.3490574
1 INTRODUCTION
Large infrastructure projects near the sea (airports, cross-sea
bridges, land reclamation, etc.) may disturb the surrounding
ecosystem. These disturbances, including noise, land reshaping, and
increased water traffic, affect the distribution and behavior of
marine mammals (e.g., dolphins) [12]. Therefore, researchers have
conducted vessel surveys (see Fig. 1a) to study the impact of these
construction disturbances on marine mammals.
However, the line-transect survey methodology [2] is labor-intensive:
a group of four people take turns observing the sea. Each observation
shift lasts 15 minutes and requires two observers, one using
binoculars and one using unaided eyes, to cover a 180° field of view
in front of the vessel (see Fig. 1b). A survey trip typically requires
4-6 hours of non-stop observation, depending on the survey area. This
work is demanding, yet the accuracy of the survey results is not
guaranteed, because human eyes can only detect motion within a 160°
field of view and observers need to rest frequently during an
observation period. For these reasons, we propose a marine mammal
detection system that supports this survey procedure, reducing human
labor and improving the survey results. This study focuses on
detecting Chinese White Dolphins (CWDs, or dolphins for short).
Unlike common object detection problems, detecting dolphins in the
wild poses several challenges:
• Scarcity. Dolphins are rarely witnessed by humans and
only appear on the water surface for around 1-2 seconds
each time.
• Small size. The recorded dolphins are of a tiny size (30 × 30
pixels in 1080p videos).
• Partially visible. In most cases, only part of a dolphin's
body can be observed (see Figure 3).
• Distracter objects. Distant objects, such as waves, sun glare,
and debris, are visually similar to dolphins and should be
distinguished from them to reduce false alarms. These objects
are regarded as distracter samples (or false positives).
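To put the small-size challenge in perspective, the following minimal sketch computes the share of a video frame that a 30 × 30 pixel dolphin occupies. It assumes that "1080p" refers to a 1920 × 1080 frame, which the text does not state explicitly; the function name box_area_fraction is hypothetical and used only for illustration.

```python
def box_area_fraction(box_w, box_h, frame_w=1920, frame_h=1080):
    """Return the fraction of the frame area covered by a box.

    The default frame size (1920 x 1080) is an assumption about
    what "1080p" means; it is not stated in the paper.
    """
    if min(box_w, box_h, frame_w, frame_h) <= 0:
        raise ValueError("all dimensions must be positive")
    return (box_w * box_h) / (frame_w * frame_h)


# A 30 x 30 pixel dolphin in an assumed 1920 x 1080 frame.
fraction = box_area_fraction(30, 30)
print(f"{fraction:.4%}")  # 0.0434%
```

Under this assumption, a 30 × 30 box covers 900 of 2,073,600 pixels, i.e., roughly 0.04% of the frame.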