Underwater Transient and Non Transient Signals Classification Using Predictive Neural Networks

Yan Guo and Bruno Gas

Abstract: The ASAROME project (Autonomous SAiling Robot for Oceanographic MEasurements) is developing a small autonomous sailboat to make measurements and observations in the marine environment over long periods. In this project, perception plays an important role by giving an estimate of the speed of surface winds, the state of the sea surface and the rate of precipitation in wet weather. In this paper, the unknown signals are first encoded with different codes (ERB, MFCC, LPC, LPCC). The coded signals are then classified by two different methods: predictive modeling and k-Nearest Neighbor. The final part of the system uses local and global decisions to recognize the class of the unknown signal. Experiments are conducted to compare the results obtained with the different encodings. Our results show that MFCC is not the ideal approach for the recognition of underwater audio signals, and that LPCC seems to be a better candidate.

I. INTRODUCTION

ASAROME (Autonomous SAiling Robot for Oceanographic MEasurements) is a research project focused on autonomous robotics. The project aims to prove the relevance of using sailing autonomous surface vehicles (ASV) for long (several weeks) observation and measurement missions in marine environments. Based on a robotized sailing boat concept from Robosoft, the ASAROME project focuses on adding and integrating advanced functionalities in the fields of aero- and hydrodynamics modeling, as well as action/perception in robotics, to build a sailing autonomous surface vehicle demonstrator.

One of the tasks of the project is the multiperception coupling task, which gathers the following detection methods: panoramic vision, radar, inertial and gyro sensors. It will be used for detecting obstacles (boats, drifting floating bodies) and for estimating the sea state (wave direction and amplitude).
In the field of perception, most of the literature to date on the detection of obstacles at sea concerns the problem of tracking and monitoring appropriate paths in order to avoid collision situations. Anticollision maneuvers are mainly based on the route of radar echoes observed on moving objects. The design of the ARPA system [1], initiated in the early 1980s, had the primary purpose of automating obstacle monitoring and the planning of safe trajectories. It continues today with the introduction of artificial intelligence tools [2]. For example, the Syllogic sailing lab [3] implemented predictive algorithms to predict the relative height and direction of nearby waves from the fusion of data from sensors placed on the boat (measurements of wind strength and direction, accelerometers, etc.). These studies correlate data related to the state of the sea and wind with data detected on the boat. They do not use visual and/or audio sensor data.

UPMC Univ Paris 06, UMR 7222, F-75005, Paris, France, 4 Place Jussieu, BP 173, 75252 Paris Cedex 05, FRANCE. This work is funded under the project ANR ASAROME (Num. ANR-07-ROBO-0009). Guo@isir.fr, Bruno.Gas@upmc.fr

We are specifically interested in the data from underwater sound sensors, with the objective of detecting near and far motor vehicles. In this context, we propose a comparative study of coding and classification algorithms commonly used in the audio field for the classification of underwater sound events (noise related to weather conditions, maritime traffic, the proximity of marine animals, etc.). Lim et al. [4], [5] have recently shown that it is possible to classify underwater transient sound events using the Mel Frequency Cepstral Coefficients (MFCC) features of acoustic frames. They proposed classifying the feature vectors either by comparing Euclidean distances (k-NN) or by training a Multilayer Perceptron (MLP).
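To make the distance-based classification mentioned above concrete, the following is a minimal sketch of a k-NN classifier with Euclidean distances. It is not the implementation of Lim et al. nor the one used in this paper: the function name `knn_classify`, the value of k, the two-dimensional toy vectors (standing in for MFCC feature vectors) and the class labels "rain" and "boat" are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_classify(query, train_vectors, train_labels, k=3):
    """Return the majority label among the k training vectors closest to query."""
    if len(train_vectors) != len(train_labels):
        raise ValueError("train_vectors and train_labels must have the same length")
    # Euclidean distance from the query to every labelled training vector.
    distances = [
        (math.dist(query, vector), label)
        for vector, label in zip(train_vectors, train_labels)
    ]
    # Keep the k nearest neighbours and take a majority vote on their labels.
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: 2-D vectors standing in for per-frame feature vectors.
train_vectors = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
                 (2.0, 2.1), (2.2, 1.9), (1.9, 2.3)]
train_labels = ["rain", "rain", "rain", "boat", "boat", "boat"]

predicted = knn_classify((0.1, 0.1), train_vectors, train_labels, k=3)
print(predicted)
```

In this toy setting the query lies inside the first cluster, so its three nearest neighbours all carry the same label. With real acoustic frames, each frame would yield one such vote.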
We propose in this article to extend the work of Lim et al. in the following ways:
• extension to the case of non-transient or long-term signals, i.e. signals whose characteristics vary only slightly over time;
• study of other coding methods (Linear Predictive Coding (LPC), Linear Prediction Cepstral Coefficients (LPCC) and Equivalent Rectangular Bandwidth (ERB));
• classification by non-linear predictive modelling;
• more signal classes (13 to 30, instead of the 8 used in [4], [5]).

In the first part of our paper, we describe the signal coding algorithms. In the second part, we discuss the classification of underwater signals. The third section of this article contains the description of the experiments and the analysis of the results obtained.

II. THE COMPOSITION OF THE SYSTEM

A. Representation of signals

For the classification problem covered in this paper, we suppose that the shape, duration and spectral response of the signal are not known. The spectral features may be highly variable in time, which has led many authors to use time-frequency representations (wavelet transform [6], Wigner-Ville distribution and Cross Wigner-Ville distribution [7], [8], or short-time Fourier transform (STFT)). Tucker and Brown [9] proposed another idea for classifying underwater transient signals recorded by passive sonar. They proposed considering perceptual acoustic features, i.e. those which contain information that human listeners are likely to use in transient classification tasks.

The 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, October 11-15, 2009, St. Louis, USA. 978-1-4244-3804-4/09/$25.00 ©2009 IEEE
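As an illustration of the time-frequency representations cited in the subsection on the representation of signals, the sketch below computes a magnitude short-time Fourier transform with the Python standard library only. It is a didactic example under stated assumptions, not the processing chain of this paper: the function name `stft_magnitude`, the frame length, the hop size, the Hann window, the sampling rate and the test tone are all chosen for illustration, and a naive DFT is used where an FFT would normally be preferred.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return one list of frame_len // 2 + 1 spectral magnitudes per frame."""
    # Hann window (assumed choice) to reduce spectral leakage at frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Naive DFT over the non-negative frequency bins, kept for readability.
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz, which falls
# exactly on bin 8 of a 64-sample frame (1000 / 8000 * 64 = 8).
sample_rate = 8000
tone_hz = 1000.0
signal = [math.sin(2 * math.pi * tone_hz * n / sample_rate) for n in range(256)]

spec = stft_magnitude(signal)
# Index of the strongest bin in each frame.
peak_bins = [max(range(len(frame)), key=frame.__getitem__) for frame in spec]
print(len(spec), len(spec[0]), peak_bins)
```

Each row of `spec` describes the spectral content of one short frame, so a signal whose spectrum changes over time appears as rows with different peak positions. For the stationary tone used here, every frame peaks at the same bin.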