Underwater Transient and Non Transient Signals Classification Using
Predictive Neural Networks
Yan Guo and Bruno Gas
Abstract— The project ASAROME (Autonomous SAiling
Robot for Oceanographic MEasurements) is developing a
small autonomous sailboat to make measurements and
observations in the marine environment over long periods. In
this project, perception plays an important role by providing
estimates of the speed of surface winds, the state of the sea
surface and the rate of precipitation in wet weather. In this
paper, the unknown signals are first encoded with different
codes (ERB, MFCC, LPC, LPCC). The coded signals are then
modeled by two different classification methods: predictive
modeling and k-Nearest Neighbor. The final part of the system
uses local and global decisions to recognize the class of the
unknown signal. Experiments are conducted to compare the
results obtained with the different encodings. Our results show
that MFCC is not the ideal approach for the recognition of
underwater audio signals, and that LPCC seems to be a better
candidate.
I. INTRODUCTION
ASAROME (Autonomous SAiling Robot for Oceanographic
MEasurements) is a research project focused on
autonomous robotics. The project aims to prove the relevance
of using sailing autonomous surface vehicles (ASV) for
long-duration (several weeks) observation and measurement
missions in marine environments. Based on a robotized sailing
boat concept from Robosoft, the ASAROME project focuses
on adding and integrating advanced functionalities in the
fields of aerodynamic and hydrodynamic modeling, as well as
action/perception in robotics, to build a sailing autonomous
surface vehicle demonstrator.
One of the tasks of the project is the multiperception
coupling task, which combines the following detection
methods: panoramic vision, radar, inertial and gyro sensors. It
will be used for detecting obstacles (boats, drifting floating
bodies) and for estimating the sea state (wave direction and
amplitude).
In the field of perception, most of the literature to date
on the detection of obstacles at sea concerns the problem
of tracking obstacles and planning appropriate paths in order
to avoid collision situations. Anticollision maneuvers
are mainly based on the tracks of radar echoes observed from
moving objects. The design of the ARPA system [1], initiated
in the early 1980s, aimed primarily at automating obstacle
monitoring and the planning of safe trajectories. It
continues today with the introduction of artificial intelligence
tools [2]. For example, the Syllogic sailing lab [3] implements
predictive algorithms that estimate the relative height and
direction of nearby waves by fusing data from the sensors
placed on the boat (measurements of the strength and direction
of winds, accelerometers, etc.). These studies correlate data
on the state of the sea and wind with data measured on the
boat. They do not use visual and/or audio sensor data.
UPMC Univ Paris 06, UMR 7222, F-75005, Paris, France, 4 Place
Jussieu, BP 173, 75252 Paris Cedex 05, FRANCE. This work is
funded under the project ANR ASAROME (Num. ANR-07-ROBO-0009)
Guo@isir.fr, Bruno.Gas@upmc.fr
We are specifically interested in the data produced by
underwater sound sensors, with the objective of detecting
nearby and distant motorized vehicles. In this context, we
propose a comparative study of coding and classification
algorithms commonly used in the audio field, applied to the
classification of underwater sound events (noise related to
weather conditions, maritime traffic, the proximity of marine
animals, etc.).
Lim et al. [4], [5] have recently shown that underwater
transient sound events can be classified using the Mel
Frequency Cepstral Coefficient (MFCC) features of acoustic
frames. They proposed classifying the feature vectors either
by comparing Euclidean distances (k-NN) or by training a
Multilayer Perceptron (MLP). In this article we propose to
extend the work of Lim et al. in the following ways:
• extension to non-transient or long-term signals, i.e.
signals whose characteristics vary only slightly over
time;
• study of other coding methods: Linear Predictive Coding
(LPC), Linear Prediction Cepstral Coefficients (LPCC)
and Equivalent Rectangular Bandwidth (ERB);
• classification by non-linear predictive modelling;
• more signal classes (13 to 30, instead of the 8 used in
[4], [5]).
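To make the relation between two of the codings listed above concrete, the sketch below estimates LPC coefficients for one frame and converts them to LPCC with the standard cepstral recursion. It is a minimal illustration under stated assumptions, not the implementation used in this work: the function names lpc_autocorr and lpc_to_lpcc are hypothetical, the autocorrelation method is assumed for the LPC estimate, and the predictor convention is s[n] ≈ sum_k a_k s[n-k] (other sign conventions exist in the literature).

```python
import numpy as np


def lpc_autocorr(frame, order):
    """Estimate predictor coefficients a_1..a_p (hypothetical helper).

    Assumed convention: s[n] ~ sum_{k=1..p} a_k * s[n-k].
    Uses the autocorrelation method and solves the normal equations directly.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation values r[0], ..., r[order]
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz system R a = (r[1], ..., r[order])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])


def lpc_to_lpcc(a, n_ceps):
    """Convert LPC coefficients to cepstral coefficients c_1..c_{n_ceps}.

    Recursion for the all-pole model 1 / (1 - sum_k a_k z^-k):
        c_n = a_n + sum_{k=1}^{n-1} (k/n) c_k a_{n-k}       for n <= p
        c_n =       sum_{k=n-p}^{n-1} (k/n) c_k a_{n-k}     for n >  p
    """
    a = np.asarray(a, dtype=float)
    p = len(a)
    c = np.zeros(n_ceps)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c[n - 1] = acc
    return c


# Example: a decaying exponential behaves like a first-order all-pole signal.
frame = 0.9 ** np.arange(200)
a = lpc_autocorr(frame, order=1)
lpcc = lpc_to_lpcc(a, n_ceps=4)
```

For a first-order model with coefficient a_1, the recursion reduces to c_n = a_1^n / n, which gives a quick sanity check of any implementation.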
In the first part of this paper, we describe the signal coding
algorithms. In the second part, we discuss the classification of
underwater signals. The third part describes the experiments
and analyzes the results obtained.
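As a point of reference for the frame-wise k-NN baseline of Lim et al. mentioned earlier, the following sketch labels each feature frame by Euclidean k-NN and then combines the frame labels into one label per signal. It is only an illustration: the function names, the class names "rain" and "boat", the toy 2-D features and the use of a simple majority vote for both steps are assumptions made here, not details taken from [4], [5] or from this paper.

```python
from collections import Counter

import numpy as np


def knn_frame_labels(train_feats, train_labels, test_feats, k=3):
    """Label each test frame by majority vote among its k nearest
    training frames (Euclidean distance). Hypothetical helper."""
    train_feats = np.asarray(train_feats, dtype=float)
    test_feats = np.asarray(test_feats, dtype=float)
    train_labels = np.asarray(train_labels)
    # Pairwise squared distances, shape (n_test, n_train)
    diff = test_feats[:, None, :] - train_feats[None, :, :]
    d2 = (diff ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return [
        Counter(train_labels[idx].tolist()).most_common(1)[0][0]
        for idx in nearest
    ]


def classify_signal(frame_labels):
    """Combine per-frame labels into one label for the whole signal
    (assumed here to be a plain majority vote)."""
    return Counter(frame_labels).most_common(1)[0][0]


# Toy data: two well-separated classes of 2-D "feature vectors".
train_feats = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
train_labels = ["rain", "rain", "rain", "boat", "boat", "boat"]
test_feats = [[0.2, 0.1], [9.5, 10.2], [10.5, 10.5], [11.0, 11.0]]

frame_labels = knn_frame_labels(train_feats, train_labels, test_feats, k=3)
signal_label = classify_signal(frame_labels)
```

In practice the feature vectors would be the coded frames (for example MFCC or LPCC vectors) rather than toy 2-D points.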
II. THE COMPOSITION OF THE SYSTEM
A. Representation of signals
For the classification problem covered in this paper, we
assume that the shape, duration and spectral response of the
signal are not known. The spectral features may be highly
variable in time, which has led many authors to use
time-frequency representations (wavelet transform [6],
Wigner-Ville distribution and cross Wigner-Ville distribution
[7], [8], or short-time Fourier transform (STFT)).
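As an illustration of the simplest of these time-frequency representations, the sketch below computes a magnitude STFT with NumPy. The function name stft_magnitude, the Hann window, and the frame length, hop size and sampling rate are assumptions chosen for the example, not parameters reported in this paper.

```python
import numpy as np


def stft_magnitude(signal, frame_len=256, hop=128):
    """Magnitude short-time Fourier transform (hypothetical helper).

    Returns an array of shape (n_frames, frame_len // 2 + 1): one
    one-sided magnitude spectrum per Hann-windowed frame.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


# Example: a 1 kHz tone sampled at 8 kHz (assumed values).
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)
spec = stft_magnitude(tone, frame_len=256, hop=128)
peak_bins = spec.argmax(axis=1)               # strongest bin in each frame
peak_freqs = peak_bins * fs / 256             # bin index to frequency in Hz
```

Each row of the result describes the spectrum over a short time interval, which is what allows time-varying spectral features to be tracked.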
Tucker and Brown [9] proposed a different approach to
classifying underwater transient signals recorded by passive
sonar. They propose to consider perceptual acoustic features,
i.e. those that carry information human listeners are likely to
use in transient classification tasks.
The 2009 IEEE/RSJ International Conference on
Intelligent Robots and Systems
October 11-15, 2009 St. Louis, USA
978-1-4244-3804-4/09/$25.00 ©2009 IEEE