Improving seabed classification from Multi-Beam Echo Sounder (MBES) backscatter data with visual data mining

Kazi Ishtiak Ahmed & Urška Demšar

Received: 6 February 2013 / Revised: 20 April 2013 / Accepted: 23 April 2013
© Springer Science+Business Media Dordrecht 2013

Abstract Multi-Beam Echo Sounders are often used for classification of seabed type, as there is a strong link between sonar backscatter and the sediment characteristics of the seabed. Most methods for seabed classification from MBES backscatter create a high-dimensional data set of statistical features and then use a combination of Principal Component Analysis and k-means clustering to derive classes. This procedure can be time consuming for contemporary large MBES data sets with millions of records. This paper examines the complexity of one of the most commonly used classification approaches and suggests an alternative in which the feature data set is optimised in terms of dimensionality using computational and visual data mining. Both the original and the optimised method are tested on an MBES backscatter data set and validated against ground truth. The study found that the optimised method improved the accuracy of classification and reduced the complexity of processing. This is an encouraging result, which shows that bringing together methods from acoustic classification, visual data mining, spatial analysis and remote sensing can support the unprecedented increases in data volumes collected by contemporary acoustic sensors.

Keywords Multi-Beam Echo Sounder · Seabed classification · Visual data mining · Self-Organising Maps · Cluster validation · Mapping accuracy

Introduction

Oceans cover 70 % of Earth's surface, yet our understanding of their waters remains limited. Historically, the main reason for this was the unavailability of equipment for ocean exploration.
Since the Second World War, with the realisation of the importance of ocean exploration, a large portion of research has focused on the technological development of underwater exploration equipment. Direct results of that research are today's sophisticated high-resolution acoustic echo sounders (Mayer 2006). Far less effort, however, has been devoted to updating the methods used to process the acoustic data collected by these advanced sensors. To this day, the majority of acoustic data processing methods are based on those used in the World War II era. These methods are not efficient in dealing with the large volumes of high-resolution acoustic data produced by modern sensors and, as a result, creating quality seabed maps from these data is difficult, time consuming and costly. This drawback in acoustic data processing is the motivation for the alternative approach proposed in this paper.

Mapping seabed type is important in a variety of applications, such as environmental research, management of marine and coastal resources, and oil and gas exploration. The tools used for this purpose are acoustic sonar systems, which transmit and receive an acoustic pulse from a device on a survey vessel (Huges Clarke et al. 1997; Mayer 2006). The data collected consist of the travel time of the acoustic pulse to the sea floor and back, and the strength of the returned signal. From these, measurements such as the depth to the seafloor (bathymetry), the depth to subsurface sediment layers (sub-bottom), and the reflectance of the sea floor (intensity of backscattered energy, or backscatter) can be derived. There is a strong link between acoustic backscatter and the sediment characteristics of the seabed (Brown and Blondel 2009; Goff et al. 2004; Huges Clarke et al. 1997); such data are therefore often used for seabed mapping.

K. I. Ahmed (*)
Health Informatics Institute, Algoma University, 1520 Queen Street East, Sault Ste. Marie, ON P6A 2G4, Canada
e-mail: ishtiak.ahmed@algomau.ca

U. Demšar
Centre for Geoinformatics, School of Geography & Geosciences, University of St Andrews, St Andrews, UK
e-mail: urska.demsar@st-andrews.ac.uk

J Coast Conserv
DOI 10.1007/s11852-013-0254-3
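
The Introduction notes that depth to the seafloor can be derived from the two-way travel time of the acoustic pulse. A minimal Python sketch of that relationship follows. It rests on simplifying assumptions that are not stated in the text: a constant sound speed of 1500 m/s and straight-line propagation, whereas operational MBES processing typically also corrects for sound-speed profiles and vessel motion. The function names and the default sound speed are hypothetical and chosen only for illustration.

```python
import math

# Assumed nominal sound speed in seawater (illustrative, not from the paper).
NOMINAL_SOUND_SPEED_M_S = 1500.0


def depth_from_two_way_time(two_way_time_s, sound_speed_m_s=NOMINAL_SOUND_SPEED_M_S):
    """Depth below the transducer for a vertical (nadir) beam.

    The pulse travels down and back, so the one-way range is half of
    sound speed multiplied by the two-way travel time.
    """
    if two_way_time_s < 0:
        raise ValueError("travel time must be non-negative")
    return sound_speed_m_s * two_way_time_s / 2.0


def depth_from_slant_beam(two_way_time_s, beam_angle_deg,
                          sound_speed_m_s=NOMINAL_SOUND_SPEED_M_S):
    """Depth for a beam steered beam_angle_deg away from vertical.

    Assumes a straight ray path (no refraction): the slant range is
    projected onto the vertical axis with the cosine of the beam angle.
    """
    slant_range_m = depth_from_two_way_time(two_way_time_s, sound_speed_m_s)
    return slant_range_m * math.cos(math.radians(beam_angle_deg))


if __name__ == "__main__":
    # A 0.2 s round trip at nadir, then the same travel time on a 60 degree beam.
    print(depth_from_two_way_time(0.2))
    print(depth_from_slant_beam(0.2, 60.0))
```

Under these assumptions the depth scales linearly with travel time, which is why an error in the assumed sound speed translates directly into a proportional depth error.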